ABSTRACT

This  thesis  investigates  speech  recognition  in  a  command  and  control  workstation 
environment.  It  discusses  the  Navy's  need  for  a  command  and  control  workstation 
(CCWS)  and  the  importance  of  the  human  interface  design.  In  particular,  it  evaluates 
the performance of Stanford Research Institute International's (SRI's) 1000-word
discrete  speech  recognizer.  The  speech  board  is  intended  to  be  used  in  the  Command 
and  Control  Multi-Media  workstation  being  developed  by  SRI.  Additionally,  it 
investigates  a  VOTAN  continuous  recognizer  (currently  in  use  by  research  and 
commercial vendors) in an interactive warfare simulation game. The results indicate
that  speech  recognition  systems  could  increase  the  capability  of  the  commander  to 
input  and  access  information,  provide  more  rapid  response  to  information  desired  or 
displayed, and enhance human interaction in the man-machine interface. Past, current,
and future speech applications are discussed.

I. SPEECH RECOGNITION IN A COMMAND AND CONTROL
WORKSTATION  ENVIRONMENT 

A.  INTRODUCTION 

Ever since the rapid influx of microcomputers, there has been an increasing incentive
to enhance the productivity of humans. Automating routine tasks, acquiring and
communicating information, and the very popular intelligent support of decision
making all attempt to exploit the potential of these machines. Of equal importance
is the growing effort to enhance human productivity through man-machine interfaces
that take advantage of these growing capabilities.

We can exchange information by a variety of methods. Our most efficient form of
communication should be available when we want to communicate information via a
computer. It has long been known that speech is the most natural and fastest form of
communication for us, and it should therefore be considered the unrivaled interface
for system optimization.

Research  into  automatic  speech  recognition  systems  has  been  ongoing  for  over 
thirty years. Automatic Speech Recognition (ASR) is defined as the ability of a
computer or device to correctly recognize spoken words and translate them into a
predetermined output string to the computer. There are many advantages to using
voice input. The most important of these are freeing the user's hands and eyes for
other tasks, employment in low-light or dark areas, and freedom of movement from a
specified location.

From this list of advantages, it would be easy to let our imaginations wander and
generate a list of thirty or more applications for voice input. Quality control on
assembly lines, sorting of packages, office automation, aircraft control, control of
wheelchairs by the disabled, and many more well-suited examples could be enumerated.
The focus of this work is to examine speech applications in the area of Command and
Control, and in particular in a Command and Control Workstation (CCWS).

B.  PURPOSE  OF  THE  THESIS 

Even though over 30 theses related to speech recognition systems alone have been
completed at the Naval Postgraduate School, there is little awareness of speech
applications in the naval environment. Evaluating state-of-the-art systems and
recommending various areas for speech applications in a shipboard environment may
raise awareness of this technology and help to incorporate speech technology in
future designs of the man-machine interface. It is without a doubt an area of
technology that has far-reaching consequences for the commander in the growing age
of computers.

C.       SUMMARY 

This  thesis  describes  the  purpose  of  the  CCWS  in  the  Distributed  Command 
Support  System  and  the  key  role  of  speech  in  the  human  interface.  Basic  speech 
technology  past,  present,  and  future  is  described  in  Chapter  III.  A  description  of  the 
experiment  used  to  test  the  Stanford  Research  Institute  International  (SRI)  'Berkeley' 
1000-word  discrete  speech  recognizer  is  presented  in  Chapter  IV.  A  follow-on 
experiment  utilizing  a  commercially  available  VOTAN  continuous  speech  recognizer  is 
described  in  Chapter  V.  Finally,  conclusions  from  these  experiments  and  the  author's 
recommendations  for  additional  speech  applications  in  a  Command  and  Control 
environment are offered.

II. ARCHITECTURAL REQUIREMENTS FOR A COMMAND AND
CONTROL WORKSTATION (CCWS)

A.       OVERVIEW 

This  chapter  will  investigate  the  needs  and  the  architectural  requirements  for  a 
Command and Control Workstation (CCWS). The particular workstation this paper
will investigate is the SUN Microsystems Computer Model-170 proposed by Stanford
Research Institute (SRI) for the U.S. Navy's needs. This paper will develop the
architectural  framework  needed  above  the  workstation  system  and  focus  on  the 
requirement  to  include  well  engineered  human  interfaces.  This  is  motivated  by  the 
immense  amount  of  information  flow  that  this  future  system  will  support.  Voice 
recognition  is  examined  as  a  potential  solution  to  the  growing  complexity  of  getting 
information  to  the  commander. 

In every C2 system there is a commander whose sole purpose is to make timely and
knowledgeable decisions. An understanding of the commander's decision process is
essential to ascertaining what the CCWS must support. Every new technological
advance alters the balance of forces and must be carefully considered. The CCWS
design seeks to create an advantage by integrating a multitude of sources into one
system. The commander must be able to exercise control over these combined
resources. He must obtain the various data in a form that he can best digest. This is
not a trivial problem, as the amount of information available to him can quickly
overwhelm his staff and work against their objective. Fusing these composite sources
of data requires a systems approach. In any systems approach one must understand how
the system will complement the architectural level above and the layer below. We
will begin with the definition of some relevant terms and an examination of the
processes and structures germane to the CCWS.

Much has been written to define Command and Control (C2) in various sources.
In lieu of adding another definition to the growing mass, we will use the Joint Chiefs
of Staff dictionary to delineate C2.


The exercise of authority and direction by a properly designated commander over
assigned forces in the accomplishment of his mission. Command and control
functions are performed through an arrangement of personnel, equipment,
communication, facilities, procedures which are employed by a commander in
planning, directing, and coordinating, and controlling forces and operations in
the accomplishment of a mission. (JCS, 1984)

As defined in Dupuy (1986), command is the authority vested in an individual of
the  armed  forces  for  the  direction,  coordination,  control,  and  administration  of  military 
forces,  and  control  is  the  authority  exercised  by  a  commander  over  the  activities  of 
subordinate  organizations  or  entities.  Since  a  computer  is  the  heart  of  CCWS,  we  will 
interpret  computer  as  a  machine  which  performs  electronic,  mathematical  manipulation 
of  new  inputs  and  existing  data  to  obtain  useful  outputs  in  near-real-time. 

In  the  simplest  terms  Command  and  Control  is  a  process  by  which  a  commander 
directs  his  resources  to  achieve  a  goal.  One  of  the  key  resources  is  the  information 
increasingly  provided  by  a  system  of  computers. 

B.       THE  COMMANDER'S  DECISION  PROCESS 

The commander's primary goal is the accomplishment of the mission. He must be
able to assimilate copious amounts of information and data. Based on his
understanding of the situation he must then make the split-second decision for which
he  alone  is  accountable.  The  process  can  be  thought  of  as  a  continuous  loop  which 
observes  the  effect  of  the  decision  on  the  environment.  This  outcome  will  be  reflected 
in  the  data  or  information  obtained,  and  the  process  repeats. 

There  are  many  models  depicting  this  reiterative  decision  process  or  loop:  J. 
Lawson's  model,  Boyd's  OODA  loop  mentioned  in  Orr  (1983)  and  the  SHOR 
paradigm  mentioned  in  Wohl  (1981).  All  of  these  illustrations  are  merely  extensions  of 
the  stimulus  response  model  of  classical  behaviorists.  For  simplicity  and  to  align  this 
feedback  loop  to  the  basic  functions  of  a  shipboard  Combat  Direction  Center,  our  view 
of  the  maritime  commander's  decision  process  will  be: 

•  COLLECT--to obtain combat information from all available sources

•  PROCESS--to sort, review, appraise, and correlate all information

•  DISPLAY--to present the information that best serves the decision maker

•  EVALUATE--decide

•  DISSEMINATE--distribute the decision

This  decision  process  can  be  at  any  level  in  the  command  structure.  For  the 
Commanding  Officer  of  a  ship,  Battle  Force  Commander,  or  even  the  Fleet 
Commander,  the  process  is  the  same.  These  loops  are  nested  within  each  other 
forming a hierarchy. The systems and processes that make up these nested loops all
work toward supporting the commander in directing and controlling his forces. The
design of a C2 system must support all these processes in a timely and accurate
manner. To motivate an understanding of the factors needed in today's information
systems, we will briefly examine the background leading to the current dilemma in
information management for the U.S. Navy maritime commander.

C.       NAVY  NEED 
1.  Brief  History 

A  primary  input  to  the  commander's  decision  process  is  monitoring  the 
environment.  This  process  within  the  decision  loop  is  supported  by  proper 
management  of  his  sensors  (collection),  the  processing  of  this  information  (process), 
and  presenting  the  information  useful  to  the  decision  maker  (display).  Historically,  the 
tactical  commander  relied  solely  on  the  organic  sensors  of  his  battle  group. 
Information from the Fleet Command or other sources was spotty at best. In the 1940s
and 1950s, the technological improvements in sensors and communications equipment
produced a huge amount of information for the commander. There was an early
indication that the unsupported decision maker could easily be overwhelmed. As
pointed out by G. A. Miller (1956) in a psychological review, ". . . current manual
methods of information processing incident to decision making may be inadequate, and
new types of filtering and preplanning will be required." (Wohl, 1981)

Through  the  years  following,  the  need  for  a  device  to  assist  the  commander 
became  even  more  apparent.  The  technological  advances  in  computers,  automatic  data 
processing  and  weaponry  were  overwhelming.  The  effect  of  longer  range  and  faster 
aircraft,  missiles,  and  guns  was  to  greatly  increase  the  area  of  responsibility  for  the 
commander. The protection of his force utilizing the 'Defense-in-Depth' concept,
consisting of a surveillance area, engagement area, and a vital area, was degraded by
his inability to manually track all the contacts in these areas effectively. Our systems
were quite inadequate to fully support the decision maker. Even the Soviets realized
this dilemma, as evidenced in this quote from General of the Army S. M. Shtemenko,
U.S.S.R.:

The volume of information that staffs must process has increased many fold since
World War II and the time allowed for decision making has decreased many fold.
As a result the requirements on the "brain capacity" of commanders and staffs
have increased vastly. To meet these requirements by simply expanding the
administrative apparatus is fundamentally impossible . . . . The only escape from
this incompatible situation lies in the extensive application of automation,
primarily computers . . . a "man-machine" system is more perfect than "man"
alone or "machine" alone . . . . Information technology does not simply help the
commander and his staff, but also stimulates the development of collective
military creativity, in which the largest group of people, including those separated
by great distances, can participate. (Druzhinin, 1972)

2. Current Deficiencies in U.S. Navy Information Processing

The U.S. Navy realized by the end of World War II that the existing combat
information center (CIC) was quickly becoming outdated. The early 1960s saw the
first digital computer system, the Naval Tactical Data System (NTDS), become
operational in the fleet. This was a revolutionary step. A machine had been connected,
through communications links, to another machine to pass real-time information in an
operational environment. NTDS is an automated method of collecting, processing,
displaying, and disseminating tactical information. Information is displayed
graphically, in real time, and provides the shipboard decision maker with a
considerable amount of information to direct his weapon employment. As time
progressed, tremendous advantages were realized in obtaining information from other
than organic battle group sources (e.g., national level sensors). This led to many
ad hoc improvements to the system that were outside of the original architecture for
NTDS and were never really designed to interface with the system. Naturally,
saturation became a problem. There were so many different systems that sailors were
often required to accommodate the differences in data format, information fusion, and
sanitization of highly classified data and sources. The problem was summarized in
Local Command Center Network Statement of Work (LCCN) (1978) as follows:


The introduction of each new technology development (communications,
weapons, sensors, electronic warfare), whether by enemy or friendly forces, may
significantly alter the manner in which multiple platforms (ships, aircraft and
submarines) can be most effectively coordinated. The proper exercise of
command and control in this changing environment requires that the combined
sources of data be presented to the commander in a form which is tailored to his
resources, mission, and surrounding environment.


3.  Systems  Approach 

The ad hoc solutions to these problems of coordination and interfacing were
complicating rather than supporting the commander's decisions. The increased
sophistication of existing systems and the addition of new requirements have caused
the number of individual components in systems to increase drastically. The need for
a systems approach to effectively manage and assess the expanding individual systems
becomes quite evident.

The large quantity of information from national, joint, and/or Navy sensors is
indispensable to the commanders in the field. The extended battle group surveillance
area has grown proportionally to the range of the over-the-horizon weapons (both the
enemy's and our own) and global sensors. This vital data is available from many
sources,  but  the  current  flow  of  information  makes  it  unobtainable.  The  information 
that  is  available  often  requires  manual  correlation.  As  a  result,  the  decision  process 
discussed  above  either  lacks  the  necessary  information  or  is  overwhelmed  by  the  reams 
of  unprocessed  data.  The  intent  of  the  Distributed  Command  Support  (DCS)  System 
is  to  reduce  the  information  processing  and  collection  load  through  correlation, 
tracking,  and  fusion  of  data. 

D.  SOLUTIONS: A NATIONAL LEVEL DISTRIBUTED COMMAND
SUPPORT (DCS) SYSTEM

As defined by Tanenbaum (1981), a distributed system is a special case of a
computer network with a high degree of connectivity, cohesiveness, and transparency.
It could be a stand-alone system or one in which the data and information are
available to anyone in the network wherever they may be located. Its application in a
C2 environment has far-reaching consequences.

The Navy understood its deficiencies in information exchange and the potential
of computer networks. The need for such a system was expressed in the following
Naval Need statement by the Naval Ocean Systems Center (NOSC, 1985):


Existing and planned Navy systems (e.g., sensors, communications, weapons and
C2 support systems) are developed as stand-alone systems. Coordination and
interpretation between systems is accomplished in an ad hoc, system-unique
manner that often requires manual coordination. Advances in weapons,
surveillance and detection systems are significantly increasing demands on the
Navy C2 systems. Therefore, these systems must be integrated in a more
adaptable, interoperable and survivable way.

The Distributed Command Support System (DCS) will provide the command
centers with a more complete and overall combat picture from both afloat and
ashore sources. Through DCS, commanders will be provided with the capability
to extract information from data transfer systems, combine that data with
artificial intelligence decision aids, and selectively present combat planning
decision aids using communication protocols . . . .


The  essence  of  the  problem  is  the  integration  of  a  wide  assortment  of  computers 
and  software.  A  non-degraded  operation  between  systems  as  well  as  a  stand  alone 
capability was envisioned. The DCS network as detailed by NOSC is shown in Figure
1.1.

The DCS system is the integration of a wide variety of systems from an
assortment of users, all able to share each other's contributions to the network. Many
of these systems already exist, with several planned for the near future (FY 87, 88).
The Local Area Network (LAN) is the heart of the system and has many different
methods to establish connections (e.g., satellite, high frequency, ultra high frequency,
and Department of Defense Network (DDN)).

Figure 1.1 Distributed Command Support.

Computer systems cannot communicate with each other unless they are compatible,
for instance, operating with the same protocols. If they are not, a scheme must be
developed to connect them and at the same time minimize the effect of the changing
protocols on processing speed. "The key to DCS is the development of standard
application protocols that will support intra- and inter-platform computer to computer
tasking." (NOSC, 1985) As shown in Figure 1.1, the Network Interface Units (NIUs)
are used to convert the protocols of one system to be compatible with another. The
NIU is analogous to the gateway shown at the bottom of Figure 1.1. The difference is
that a gateway may be capable of connecting two or more networks.

E.        FLEET  AND  BATTLE  GROUP  LEVEL:    CCWS 

As shown in Figure 1.1, the CCWS is an integral part of this network. This is
where the commander interfaces to the system and as such is the focus of the rest of
this  thesis.  It  will  receive  all  the  information  on  the  network.  A  secure  computing 
project  will  make  it  possible  for  all  the  users  to  have  the  same  data  base  but  have 
access  only  to  those  data  elements  for  which  they  have  the  security  and  need 
requirements.  The  use  of  a  trusted  guard  will  control  the  access  to  the  data  base  and 
allow  secure  operation  of  the  system  with  various  levels  of  classification.  For  example, 
the  Fleet  Commander  may  have  global  access  and  unlimited  security  eligibility  while 
the  squadron  commander  will  have  theater  coverage  and  security  access  for  only 
specific  areas.  The  major  advantage  is  that  the  entire  data  base  will  be  in  every 
location  increasing  the  connectivity  and  cohesiveness  of  the  information. 
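
The trusted guard idea described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each data element carries a classification level and a geographic region, and that each user has a clearance and a set of regions of interest. The names `Record`, `User`, `guard_filter`, and the level ordering are hypothetical and are not taken from the DCS or CCWS design.

```python
from dataclasses import dataclass

# Hypothetical classification levels, ordered from lowest to highest.
LEVELS = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2, "TOP SECRET": 3}


@dataclass(frozen=True)
class Record:
    content: str
    level: str   # classification of the data element
    region: str  # geographic area the element belongs to


@dataclass(frozen=True)
class User:
    name: str
    clearance: str      # highest level the user may read
    regions: frozenset  # areas the user needs; "*" stands for global access


def guard_filter(user, database):
    """Return only the records the user is cleared for and has a need for."""
    visible = []
    for rec in database:
        cleared = LEVELS[rec.level] <= LEVELS[user.clearance]
        needed = "*" in user.regions or rec.region in user.regions
        if cleared and needed:
            visible.append(rec)
    return visible


# The same data base is held everywhere; only the view differs per user.
database = [
    Record("contact report A", "SECRET", "PACIFIC"),
    Record("contact report B", "TOP SECRET", "ATLANTIC"),
    Record("weather summary", "UNCLASSIFIED", "ATLANTIC"),
]
fleet = User("fleet commander", "TOP SECRET", frozenset({"*"}))
squadron = User("squadron commander", "SECRET", frozenset({"ATLANTIC"}))

print([r.content for r in guard_filter(fleet, database)])
print([r.content for r in guard_filter(squadron, database)])
```

In this sketch the fleet commander sees every record, while the squadron commander sees only the elements within his clearance and area. A real guard would also have to handle auditing and sanitization, which are outside this illustration.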

The investigation of personal computing techniques utilizing a distributed
workstation environment for the support of command and control operations for the
U.S. Navy was formally initiated in early July 1980, when SRI International was tasked
with a feasibility study. Computer systems and technology have changed significantly
since the initial study; however, the basic capabilities and design considerations have
remained intact. The capabilities of a workstation in a Command and Control
distributed network, as pointed out by Poggio (1985), should be:

•  The  expeditious  acquisition  of  up-to-date  multi-media  information 

•  Flexible, reliable, timely exchange of information among people, and between
people and processes.

•  Rapid match of information transport requirements to dynamic communication
capacity 

•  Survivability  -  loosely  coupled  autonomous  systems 

These capabilities translate directly into the Distributed Command Support System
and a battle group environment. The intelligence gathered from outside sources would
be combined with the sensor information provided by the battle group's organic
equipment. The capability for many users to simultaneously plan, decide, and
disseminate information in a multi-media environment will greatly enhance the
commander's decision process.

In  addition  to  network  information,  the  system  is  designed  to  provide  decision 
support systems to aid the commander in the decision process. Referring to our model
of the decision process shown in Figure 1.2, one can readily see that the CCWS
supports four of the five functions and assists the commander's actual decision.
Today's Battle Group Commander should have at his disposal all the available
information utilizing the technological hardware and software to make the correct
decisions or evaluations. Therefore, to support the commander we should allow the
computer to do what it can do best (i.e., fusion of data) and allow the human to do
what only he can do, make the decisions. SRI is incorporating these ideas into a
computer-based multi-media information system. The current design of the
workstation plans to accommodate this arrangement.

Figure 1.2 Functional View of the Commander's Decision Process Supported by CCWS.


F.       HUMAN  INTERFACES  TO  CCWS 

Everything meaningful in the operation, extraction, and manipulation of
information available from CCWS results from human interaction with the display.
Since the sole reason for the workstation is to assist and extend the capabilities of the
commander, the user interface should be of utmost importance. As stated in NOSC
(1985), ". . . the man-machine interface must be more natural and efficient, readily
adaptable to the peculiarities of the user and support multi-media (i.e., voice, graphics,
text) messages and information." High resolution, bit-mapped color displays,
sophisticated window and cursor controls, and speech recognition are all available now
for implementation in these personal workstations.

Figure 1.3, taken from Poggio (1985), shows SRI's design considerations for
several  man-machine  interface  components.  The  various  instruments  by  which  we 
communicate  instinctively  (speaking,  pointing,  and  writing)  are  all  available  in  these 
human  interfaces.  The  evaluation  and  enhancement  of  the  man-machine  interface, 
particularly in the speech realm, is the focus of this thesis.

Figure 1.3 CCWS Man-Machine Interface Components
(Adapted from Poggio, 1985).

1. Voice Entry

Since humans have such a propensity for talking, it is only logical that speech
input/output would be one of the ideal man-machine interfaces. Automatic Speech
Recognition (ASR) is defined as the ability of a computer or device to correctly
recognize spoken words and translate them into a predetermined output string to the
computer. There are many advantages to using voice input. The most important of
these advantages is freeing the user's hands and eyes for other tasks, allowing for
increased productivity and more rapid system response because speech input is faster
than conventional keyboard entry. The incorporation of ASR enables the C2 system
to be a true extension of the commander's decision making ability utilizing current
technology, his organization, and its procedures.

2.  Automatic  Speech  Recognition  Requirements 

The following is a list of the critical requirements necessary in an automatic
speech recognizer for incorporation into the CCWS.

•  Large vocabulary (capacity > 1000 words).

•  Real-time response.

•  Very high recognition accuracy (> 98%).

•  Adaptable to the user (i.e., the user should not have to modify or alter his
speaking rate significantly).

•  No deterioration in accuracy in noisy and stressful environments.

These specifications are believed by the author to be those items necessary for an
effective and viable speech recognition system. The minimum capacity of 1000 words
was specified since this was a previous goal set in 1971 by the Department of Defense
(Barr and Feigenbaum, 1981). An accurate, versatile, and fast large vocabulary system
which adapts readily to any user should be the goal of all manufacturers of automatic
speech recognizers. Consequently, this list will be the criteria for final evaluation of
the SRI 1000 word discrete recognizer and the VOTAN continuous word recognizer.
Since each speech recognizer is different, it is crucial that those responsible for the
man-machine interface spend sufficient resources in defining the requirements of a
particular system and finding the correct speech system to match.
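
The requirements list above can be turned into a simple checklist, sketched here under stated assumptions. The field names and the sample figures are hypothetical and are not measurements from this thesis. Because the text speaks of a minimum capacity of 1000 words, the sketch treats exactly 1000 words as acceptable, and it interprets "no deterioration" as noisy-environment accuracy being at least equal to quiet-environment accuracy.

```python
from dataclasses import dataclass


@dataclass
class RecognizerResult:
    vocabulary_size: int        # number of words the recognizer can hold
    real_time: bool             # responds without noticeable delay
    accuracy: float             # fraction recognized correctly in quiet conditions
    noisy_accuracy: float       # same measure under noise or stress
    needs_altered_speech: bool  # True if the user must change speaking rate


def evaluate(r):
    """Check one recognizer against the five requirements; returns a dict of booleans."""
    return {
        # Assumption: a 1000-word capacity meets the stated minimum.
        "large vocabulary": r.vocabulary_size >= 1000,
        "real-time response": r.real_time,
        "high accuracy": r.accuracy > 0.98,
        "adaptable to user": not r.needs_altered_speech,
        # Assumption: no deterioration means noisy accuracy is not lower.
        "robust to noise and stress": r.noisy_accuracy >= r.accuracy,
    }


# Hypothetical figures for illustration only.
candidate = RecognizerResult(
    vocabulary_size=1000,
    real_time=True,
    accuracy=0.99,
    noisy_accuracy=0.95,
    needs_altered_speech=False,
)
report = evaluate(candidate)
for requirement, met in report.items():
    print(f"{requirement}: {'met' if met else 'NOT met'}")
```

A tabulation like this makes it easy to see which single requirement disqualifies a candidate system, which is often more informative than an overall pass or fail.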

G.       CONCLUSION 

The sole purpose of a command and control system is to support the
commander's decision process. The current system (NTDS) is overwhelmed by the
amount of information it must process and is cluttered with ad hoc equipment that
was never really designed to be interfaced with it. An inadequate system
exists for today's commander.

A  systems  approach  utilizing  the  technological  advances  in  distributed  networks 
and  personal  computing  led  to  the  development  of  DCS  and  CCWS.  The  workstation 
in  development  will  incorporate  the  latest  in  protocols  and  will  focus  on  supporting  the 
operational commander. The system design is to take full advantage of the man-machine
interfaces. Since our fastest and most efficient means of communication is
speech, it is only reasonable that the design of the CCWS should consider speech
input/output interfaces. This will ensure that the architecture for the command and
control workstation is designed to be a true extension of the commander.

III. SPEECH TECHNOLOGY PAST, PRESENT AND FUTURE

A.  OVERVIEW 

This  chapter  will  describe  the  basic  types  of  speech  recognition  systems  and  a  few 
of the fundamental terms associated with these systems. The history of speech
input/output systems and forecasts of the future of speech technology are discussed in
broad terms. It is important to realize that each automatic speech recognizer uses
different  algorithms.  The  user  must  be  thoroughly  familiar  with  the  particular  system 
to  ensure  that  it  is  the  correct  equipment  for  the  task  and  that  proper  training  and 
programming  of  the  system  has  been  achieved.  A  basic  familiarity  with  the  terms  and 
the  types  of  speech  recognition  systems  is  essential  in  comprehending  this  rapidly 
growing  technological  field. 

B.  DEFINITION OF TERMS

Before  discussing  speech  recognition  systems,  we  need  to  define  and  discuss  the 
various  generic  types  of  speech  systems.  As  shown  in  Figure  2.1  there  are  two  major 
types: speaker dependent and speaker independent. A speaker-dependent system relies
entirely  on  the  user  training  the  speech  recognition  system.  The  user  speaks  an 
utterance  (one  or  more  words  in  a  phrase)  usually  1-5  times  for  each  word  or  a 
particular  output  string.  The  equipment  translates  the  frequency  vs.  time  output  into  a 
normalized,  digital  matrix.  Depending  on  the  manufacturer,  these  may  be  manipulated 
by  some  averaging  algorithm  or  just  stored  as  separate  templates  in  memory  or  in  a 
data  base.  A  template  is  the  digital  representation  or  matrix  of  the  utterance  which  is 
used  by  the  device  to  compare  against  your  spoken  word.  Each  system  uses  different 
algorithms  to  calculate  the  template  and  a  thorough  understanding  of  the  algorithm 
used  by  the  device  is  required  to  maximize  recognition  through  proper  training. 

When  a  particular  utterance  is  spoken,  it  is  compared  against  the  template  in 
memory  and  if  it  is  within  a  pre-established  limit  or  threshold,  the  device  performs  the 
function  the  user  has  installed  on  the  system.  If  it  does  not  meet  the  threshold  level, 
the  utterance  is  rejected  and  nothing  is  sent  by  the  recognizer.  Additionally,  there  are 
two  other  events  which  can  occur:  an  insertion  or  a  substitution  error.  An  insertion 
occurs  when  a  recognition  takes  place  due  to  spurious  noise  or  an  utterance  other  than 
those that are legitimate entries in the data base. For example, an insertion occurs if
you say 'defcon' or a similar word NOT in your database and the system recognizes it
and outputs the string for 'defense'. A substitution, on the other hand, occurs when
your input utterance is calculated as a closer match to a different template in storage,
thus incorrectly recognizing another word. For example, a substitution occurs if
'defcon' and 'defense' ARE both currently in the database and the utterance 'defcon'
produces the string 'defense'. (Pallett, 1985)

Figure 2.1 Automatic Speech Recognition Systems.
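
The four outcomes described above (correct recognition, rejection, insertion, and substitution) can be illustrated with a minimal sketch. It is not any vendor's algorithm: the three-element feature vectors stand in for the normalized digital matrix, and the Euclidean distance, the threshold value, and the names `distance` and `recognize` are assumptions chosen only for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def recognize(utterance, templates, threshold):
    """Return the output string of the closest template, or None if rejected."""
    best_word, best_score = None, float("inf")
    for word, (template, _output) in templates.items():
        score = distance(utterance, template)
        if score < best_score:
            best_word, best_score = word, score
    if best_score > threshold:
        return None  # rejection: nothing is sent by the recognizer
    return templates[best_word][1]

# Hypothetical two-word data base: word -> (template, output string)
templates = {
    "defcon": ([1.0, 4.0, 2.0], "DEFCON"),
    "defense": ([1.0, 4.0, 3.0], "DEFENSE"),
}
THRESHOLD = 1.0  # assumed value

correct = recognize([1.0, 4.1, 2.1], templates, THRESHOLD)       # 'defcon' spoken clearly
substitution = recognize([1.0, 4.0, 2.7], templates, THRESHOLD)  # 'defcon' spoken, 'defense' closer
rejection = recognize([9.0, 9.0, 9.0], templates, THRESHOLD)     # nothing within the threshold
insertion = recognize([1.2, 4.2, 3.1], templates, THRESHOLD)     # noise that falls within the threshold
```

Raising the threshold in this sketch trades rejections for insertions, which is why the threshold must be tuned to the particular device and vocabulary.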

The  speaker-dependent,  template  matching  systems  are  the  most  common 
systems  on  the  market.  A  system  trained  to  a  particular  individual  can  achieve 
recognition accuracies of 90 to 99 percent. On the other hand, a speaker independent
system  contains  algorithms  which  are  robust  enough  for  any  individual  to  be  correctly 
recognized.  Such  a  device  requires  no  training  since  each  word  is  represented  by 
templates  which  are  an  average  of  a  wide  range  of  different  utterances  selected  by  the 
manufacturer.  Depending  on  the  size  and  limitations  of  the  vocabulary,  recognition 
accuracies  are  slightly  less  than  those  experienced  by  the  speaker  dependent  systems. 
The goal of most speech recognition manufacturers and researchers is to develop a
large  vocabulary  recognizer  which  is  independent  of  the  user.   (Poock,  1986b) 

Each  of  these  two  categories  is  further  subdivided  into  three  separate  categories: 
discrete, connected, and continuous. A discrete system, or isolated word system, as its
name implies, is one in which the user must pause for a predetermined time (about 0.1
sec) between consecutive utterances. The device establishes the start and endpoint of
the  word.  These  utterances  are  compared  to  what  is  in  memory  and  the  output  string 
is  sent  once  the  recognizer  has  calculated  the  best  match. 

The  connected  speech  system  requires  no  pauses  between  utterances.  The  system 
is  continually  checking  what  is  spoken  and  what  is  in  memory.  As  the  word  or  phrase 
is  recognized,  the  device  is  loading  that  particular  string  into  the  output  buffer.  Once 
the user pauses, the system unloads all that it has accumulated in the buffer.

In  contrast,  the  continuous  system  outputs  the  prescribed  string  immediately  upon 
recognition  and  does  not  wait  for  a  pause  from  the  user.  Even  though  there  appear  to 
be  no  apparent  word  boundaries,  the  device  is  able  to  calculate  matches  and  produce 
the output strings. This is much harder than discrete recognition, since major
changes occur in the pronunciation of words at the word boundaries, an effect known as
coarticulation. These are differences in speech patterns not found in isolated or discrete
word pronunciation.
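
The difference in output behavior between connected and continuous systems can be sketched as follows. This is only an illustration of the buffering described above, not a model of any particular product; the event list, the use of `None` to mark a pause, and the function names are assumptions.

```python
PAUSE = None  # assumed marker for a pause in the stream of recognized strings

def connected_output(events):
    """Buffer recognized strings and unload the buffer only when the user pauses."""
    emitted, buffer = [], []
    for event in events:
        if event is PAUSE:
            if buffer:
                emitted.append(list(buffer))
                buffer.clear()
        else:
            buffer.append(event)
    if buffer:
        emitted.append(list(buffer))  # unload whatever remains at end of input
    return emitted

def continuous_output(events):
    """Emit each recognized string immediately, without waiting for a pause."""
    return [[event] for event in events if event is not PAUSE]

# Hypothetical stream of recognized strings
events = ["set", "defcon", "two", PAUSE, "display", "tracks", PAUSE]
connected = connected_output(events)
continuous = continuous_output(events)
```

In this sketch the connected system delivers two bursts of output, one per pause, while the continuous system delivers five separate outputs as the words are recognized.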

Manufacturers  today  are  still  not  in  agreement  over  exactly  what  constitutes  the 
difference  between  these  last  two  types.  As  stated  earlier  each  and  every  system  is 
different  and  must  be  thoroughly  tested  and  analyzed  to  ascertain  exactly  what  the 
manufacturer  is  trying  to  represent  in  his  literature. 

C.       PAST 

Many of the larger technical companies like IBM, Philco-Ford, RCA, and Bell
Telephone Laboratories started research in the 1950's and 1960's. It was not
until the early 1970's that the first commercially available products were offered by
Threshold Technology, Inc. and Scope Electronics. (Poock, 1986b)

Concurrently  in  the  early  1970's,  the  U.  S.  Department  of  Defense  Advanced 
Research  Projects  Agency  (ARPA)  funded  a  five-year  program  in  speech  understanding 
research  (SUR). 

ARPA funded five speech projects and several subcontracts for developing parts
of speech-systems. Some of the major ARPA contractors produced multiple
systems during the five-year period: Work at Bolt, Beranek and Newman, Inc.
(BBN) produced first SPEECHLIS and then HWIM (Hear What I Mean),
building on earlier BBN research on understanding natural language. Carnegie-
Mellon University (C.M.U.) produced the HEARSAY-I and DRAGON systems
in the early development phase (1971-1973) and the HARPY and HEARSAY-II
programs by 1976. SRI International also developed a speech understanding
program, partly in collaboration with Systems Development Corporation (SDC).
(Barr and Feigenbaum, 1981)


The  ARPA  projects  were  all  built  for  the  purpose  of  developing  a  speech 
understanding  device,  but  they  varied  considerably  in  levels  of  difficulty,  number  of 
speakers,  ambient  noise,  etc.  As  a  result  of  this  effort  there  was  considerable  progress 
made  toward  practical  speech-understanding  systems.  One  of  the  most  important  ideas 
to surface from these projects was the influence of Artificial Intelligence (AI) research
and system architecture. The researchers found phonetic recognition was the most
promising answer to continuous speech understanding, but at the time they did not
have the computing power necessary, nor was it as straightforward as initially
anticipated.  Since  the  early  success  of  speech  recognition  used  template  matching, 
industry  abandoned  the  harder  track  of  speech  phonetics. 

D.       PRESENT 
1.  Overview 

Currently  there  are  literally  thousands  of  organizations  in  the  United  States 
and  around  the  world  exploiting  speech  systems.  From  controlling  robot  arms  on  the 
space shuttle to incorporation into children's toys, speech input/output systems are in
daily use and are growing rapidly. Despite ARPA's efforts, up until now all the speech
systems have consisted of relatively small vocabulary pattern matching or
template  matching  techniques.  The  better  systems  can  be  expected  to  have  recognition 
accuracies  of  better  than  97%. 

There are several periodicals, like the Journal of the American Voice I/O
Society and Speech Technology: Man/Machine Voice Communications, which reflect the
latest in research, applications of speech processing, and product reviews. In fact, in a
recent edition, there were 193 different companies listed providing various products
and/or services in the speech field. Speech recognition today is extremely capable and
reliable  and  could  be  applied  to  thousands  of  areas  with  more  awareness  and 
understanding  of  its  benefits  to  both  user  and  management. 

2.  Speech Applications in Command and Control

Application  of  speech  recognition  systems  in  a  shipboard  environment  need 
not  stop  with  the  CCWS.  There  are  many  other  areas  where  using  this  technology 
could  be  beneficial.  In  the  Combat  Direction  Center,  manipulating  NTDS  displays  and 
functions on these consoles by voice in conjunction with the trackball tab, computer
controlled  action  entry  panel  (CCAEP),  digital  data  entry  unit  (DDEU),  and  category 
select  panel  would  allow  users  to  more  quickly  disseminate  information  and  result  in 
less  operator  fatigue.  Data  retrieval  by  the  Commanding  Officer  or  Tactical  Action 
Officer  to  display  decision  aids  or  threat  matrices  by  voice  could  promote  better 
weapon  or  countermeasures  selections.  The  automatic  speech  recognizer  could  allow 
the  commander  to  focus  totally  on  the  display. 

The Combat Direction Center is not the only area on the ship that could benefit
from speech recognition systems. A voice activated expert system for controlling
engineering propulsion plant casualties would greatly enhance the reduced manning
policy on the automated gas turbine powered ship classes. Remote activation of
damage control (DC) or firefighting equipment by personnel outside the damaged
space could reduce the risk of injury to sailors and damage to equipment.

The  list  could  continue.  Salfer  (1985)  presents  a  more  detailed  analysis  of 
applications  of  ASR  systems  onboard  the  FFG-7  class  ships  which  could  be  expanded 
to  include  other  classes  of  ships  as  well.  The  underlying  reason  for  pointing  out 
various  other  areas  for  speech  applications  is  to  stimulate  awareness  and  generate  other 
ideas  for  applications  for  this  technology. 

It is important to note that, regardless of how much faster or better a system can
work employing automatic speech recognition technology, if the user and management
do not have the motivation to examine such a system, this equipment, like others, would
have no hope for success.

E.       FUTURE 

Speech recognition should in no way be considered stagnant. Manufacturers and
corporations are more eager than ever to reap the benefits of this technological field.
As awareness and knowledge of this technology become more widespread,
especially in man-machine interfaces, a greater proliferation of systems will be seen.

The new horizon for speech recognition systems is to move away from template
matching schemes to the more flexible phonetic recognition. The basis of phonetic
systems is phonemes, the basic units of all speech. Once the system is trained on words
utilizing all the combinations of phonemes, the formulation of any word is possible.
For example, this phrase, taken from Speech Systems Incorporated advertising literature,

continuous speech development toolkit

would look like this phonetically:

kantinyuasspichdivelapmentulk.it. 

The phonemes are then converted by different syntactic and dictionary builders in a
computer, which produce the correctly formulated string. At the 1986 American Voice
Input/Output Society (AVIOS) convention, there was only one vendor, Speech Systems
Incorporated, marketing a phonetic recognition system. It is the first
commercial system of its type. It is surely the trend of future speech
recognition/understanding systems, and it is one focus of Department of Defense
funding.

In addition to industrial and university research, the Defense Advanced Research
Projects Agency (DARPA, formerly ARPA) is sponsoring another multi-million dollar
contract titled the Strategic Computing Program. A major part of the Strategic Computing
Program  is  the  integration,  transition,  and  performance  evaluation  of  speech 
technology.  "The  speech  recognition  portion  of  the  Strategic  Computing  Program  is 
divided  into  two  major  areas:  continuous  speech  recognition  and  robust,  connected- 
word  recognition  .  .  ."   (Strategic  Computing,  1985). 

The aim of this program is to make continuous speech recognition a reality.
The major thrust would be in the area of phonetic recognition to deal with speaker
variation, large vocabularies, natural grammars, and real time response. In the area of
robust speech recognition, the objectives are to improve upon current systems' capacity
to deal with variations and distortions of the input speech signal caused by severe acoustic
noise and the physiological/psychological stress found in military applications. (Strategic
Computing Program, 1985)

Increased  use  of  computers  in  problem  solving  will  demand  more  emphasis  on 
man-machine  interfaces.  Speech  recognition  will  be  that  interface  which  makes  the 
computer a true extension of man. We communicate with each other by speech, so it
is only natural to expect that we can do the same with a computer. This cursory look at
speech types and speech related terminology is meant only to familiarize the reader
with terms to be used later and to introduce the ever broadening future of speech
input/output systems.

IV.  TEST, ANALYZE, AND EVALUATE THE SRI 'BERKELEY' SPEECH BOARD

This  chapter  describes  a  series  of  tests  whose  purpose  was  to  confirm  the  voice 
recognition  performance  of  the  SRI  'Berkeley'  board  as  reported  in  Murveit  (1986). 
The  results  of  the  SRI  study  suggest  that  a  1000-word  discrete  speech  recognition 
system  does  not  sacrifice  accuracy  despite  the  high  processing  speeds  necessary  for 
large vocabulary recognition. Their report indicates that the Berkeley speech board
system  achieved  a  recognition  accuracy  of  over  90  percent  for  a  1000  word  vocabulary 
and  over  99  percent  for  a  sixteen  word  vocabulary.  In  addition  this  chapter  will 
examine  the  algorithms  used  by  the  speech  board  for  initial  template  creation,  voice 
recognition,  and  error  correction. 

A.  DESCRIPTION 

SRI  selected  the  'Berkeley'  board  because  it  was  the  state  of  the  art  in  large 
vocabulary  speech  recognition.  A  recognizer  of  this  type  was  a  necessary  requirement 
in  a  CCWS  for  a  faster  and  more  natural  man-machine  interface  in  command  entry 
and  database  access.  Specifically,  the  research  conducted  by  SRI  was  for  the 
enhancement  of  speech  interfaces  for  natural-language  data-base-management  tools. 
In cooperation with U.C. Berkeley, SRI modified the design slightly and interfaced it to
the SUN-170 Microsystems computer.

B.  THE SUN-170 MICROSYSTEMS WORKSTATION

The SUN-170 Microsystems workstation is a UNIX based computer system.
These  workstations  are  used  in  a  variety  of  applications.  The  value  of  workstations 
was  realized  with  the  increase  in  computer  power  provided  by  the  development  of  16 
and  32  bit  microprocessors.  A  typical  workstation  will  generally  consist  of  a  1  MIPS 
(million  instructions  per  second)  CPU,  2-4  Megabytes  of  memory,  a  high  resolution 
(1000  by  1000  pixels)  display,  a  keyboard,  and  a  mouse.  The  speech  board  is  interfaced 
to  the  SUN  and  receives  the  audio  input  directly. 

The workstation used in this experiment is the host computer on the Defense
Data Network (DDN) at address SRI-BOZO. Several services resident on the DDN,
such as file transfer protocol (FTP) and telnet (TN), allowed remote work on the
vocabulary and data processing from the
Naval Postgraduate School.

C.  MARA

MARA comprises the hardware and software components that integrate the speech
recognizer into the workstation. The MARA system consists of:

•  the  computer  and  its  programs 

•  the  speech  recognizer 

•  the  user 

The MARA hardware consists of a Multibus PC board, a backplane with a
connector, a BNC cable, a pre-amplifier, and a microphone. The software components
include: 

•  The PC board program - mara86.com

•  The MARA Daemon - mara

•  The Low Level Recognition command library - libmara.a

•  The Standard library - libmara.a

•  Support libraries for various applications - libmarawindow.a

The MARA system in the broadest sense is the combination of equipment and
programs that are referred to as the SRI 'Berkeley' board. (Kavaler, 1986)

D.  THE SRI 'BERKELEY' BOARD

The  speech  recognition  board,  as  its  name  implies,  is  a  single  circuit  board.  This 
board  is  built  with  a  multibus  interface  and  is  modified  to  be  inserted  directly  into  the 
SUN  Microsystems  computer  workstation.  The  speech  board  is  divided  into  two 
separate  subsystems.  The  front-end  subsystem  manipulates  the  input  into  a  form  to  be 
analyzed  by  a  comparator  subsystem  where  the  voice  templates  are  stored. 

1.  Front  End 

The utterance, in the form of a frequency vs. time signal, enters through a series
of 16 bandpass filters. The outputs are rectified and then low-pass filtered over a
period of time. The signal is then divided into 10 millisecond frames. Each frame ". . .
is the average voltage a speech signal has in several frequency bands. The system
computes speech frames at a rate of one hundred times a second." (Murveit, 1986)
While computing the frames, the system checks whether or not a word is
really being spoken (referred to as endpoint detection). Assuming that a word is being
spoken, the system varies the spectral sampling rate dynamically. The spectral
difference between adjacent frames is computed, and if the distance is insignificant
then the frame is discarded. This technique is called selective downsampling and it
reduces the data rate through the system, particularly during the long steady-state sounds in
words. The result of disregarding the insignificant frames in this manner is improved
accuracy, real time vocabulary processing, and expanded template storage memory.
The front end subsystem then downloads the frames into the comparator.
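
Selective downsampling as described above might be sketched as follows. The distance metric (sum of absolute band differences), the threshold, and the choice to compare each frame against the last frame kept rather than strictly against its immediate neighbor are all assumptions; the board's actual criteria are not given here.

```python
def spectral_distance(a, b):
    """Sum of absolute differences across frequency bands (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def selective_downsample(frames, min_distance):
    """Keep a frame only if it differs significantly from the last frame kept."""
    kept = []
    for frame in frames:
        if not kept or spectral_distance(frame, kept[-1]) >= min_distance:
            kept.append(frame)
    return kept

# Hypothetical two-band frames (the board uses 16 bands): a steady sound,
# a transition, then another steady sound.
frames = [[1, 1], [1, 1], [1, 2], [5, 5], [5, 5], [5, 6]]
reduced = selective_downsample(frames, min_distance=2)
```

In this example six frames reduce to two, one for each steady-state sound, which illustrates how the technique lowers the data rate and frees template storage.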
2.  Comparator 

As the name implies, this subsystem compares the incoming frames with those
already in memory. This is accomplished by a technique called dynamic time warping.
The input frames are compared with the reference frames of the words in memory,
and the sum of the spectral distances between them is computed. A score or cost for
each and every word in memory is then computed and the minimum value is sought.
The lower the score computed by the algorithm, the better the recognition. As
discussed  in  Chapter  3,  if  the  score  is  below  a  rejection  threshold  then  the  string 
specified  for  the  word  is  output.  If  the  word  score  is  above  this  value  a  non- 
recognition  occurs. 
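
The comparison step can be illustrated with a textbook form of dynamic time warping. This is a minimal sketch, not the board's implementation: the frame distance, the allowed warping steps, and the names `dtw_cost` and `best_match` are assumptions.

```python
def frame_distance(a, b):
    """Sum of absolute band differences between two frames (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def dtw_cost(inp, ref):
    """Minimum summed frame distance over all monotonic alignments of inp to ref."""
    n, m = len(inp), len(ref)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(inp[i - 1], ref[j - 1])
            # extend the cheapest of: stretch input, stretch reference, advance both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def best_match(inp, references, rejection_threshold):
    """Score every word in memory; return (word, score), with word None on rejection."""
    scores = {word: dtw_cost(inp, ref) for word, ref in references.items()}
    word = min(scores, key=scores.get)
    if scores[word] > rejection_threshold:
        return None, scores[word]
    return word, scores[word]

# Hypothetical one-band reference frames for two words
references = {"yes": [[1], [2], [3]], "no": [[3], [2], [1]]}
slow_yes = [[1], [1], [2], [3]]  # 'yes' spoken with a lengthened first sound
match = best_match(slow_yes, references, rejection_threshold=2.0)
```

The lengthened utterance still scores 0 against 'yes' because the warping absorbs the repeated frame, while its score against 'no' is 5.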

E.  SUBJECTS 

One civilian and one military officer participated in the testing of the SRI speech
board. Both subjects were male, 32 to 46 years old. The civilian (M1) was very
experienced with many types and models of speech recognition systems, while the
military officer (M2) had less than 12 hours total exposure to speech systems.

F.  TRAINING  ALGORITHM 

The  training  was  conducted  in  a  low  noise  speech  lab  at  SRI  utilizing  a  SHURE 
SM-10  close-talking  microphone.  A  training  algorithm  was  used  to  develop  the 
templates for each speaker. This speaker dependent system requires the user of the
training algorithm to specify how many training passes are desired as well as the
"cluster" size and method of input. This would allow one to input utterances from a
tape  recording  and  have  the  algorithm  form  templates  on  a  fixed  number  of  passes 
from  the  recording.  The  cluster  size  is  an  averaging  technique  which  is  the  essential 
ingredient  in  creating  templates.  To  form  a  cluster,  an  initial  template  (the  first 
training  pass  usually)  is  compared  against  another  utterance  for  that  word  or  phrase. 
The  spectral  distance  is  calculated  and  compared  to  the  initial  utterance(s)  in  memory. 
If  the  minimum  average  distance  is  less  than  the  distance  specified  in  the  algorithm, 
then  one  template  is  formed.  Otherwise  the  system  will  indicate  that  a  template  could 
not  be  formed  since  the  spectral  clusters  were  outside  the  limits.    The  trainer  program 
then will prompt for more repetitions in an effort to generate a single template. If after
three more repetitions a single template still could not be created from the additional
utterances, two templates for the same word are computed. Each template and spoken
word is placed alphabetically in a Unix directory. The templates are indicated by a file
type beginning with .t while the utterances are identified by a file type beginning with .u.
For example, if the word "advisory" is spoken twice in creating one template, one would
find the files advisory.t1, advisory.u1 and advisory.u2. This is unique to this system and
the advantages of this scheme will be
evident later in this chapter.

G.       THE  VOCABULARY 

Any  vocabulary  file  can  be  created  by  specifying  the  word  prompt  followed  by 
two colons, then the keystrokes or output string. This file is in the working directory
and  is  specified  when  invoking  the  trainer  algorithm.  In  this  particular  experiment  the 
subjects  used  a  100  word  initial  vocabulary  taken  from  the  1000  word  set  used  by  SRI 
(Appendix  A).  A  second  vocabulary  which  was  used  in  extensive  studies  conducted  at 
the  Naval  Postgraduate  School  (Poock,  1981,  1986a)  was  sent  directly  to  the  SUN 
workstation  at  the  host  (SRI-BOZO)  via  the  DDN.  This  vocabulary  of  240  utterances 
is  shown  on  the  data  sheet  in  Appendix  B.  It  is  divided  into  five  groups  of  words 
based  on  the  number  of  syllables.  There  were  10%  one  syllable  words,  30%  two 
syllable  words,  20%  three  syllable  words,  20%  four  syllable  words,  and  20%  five  or 
more  syllable  words.  These  words  were  selected  from  commands  typically  used  in  a 
command  center. 
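
The vocabulary file format described at the start of this section (word prompt, two colons, output string) could be read by a short routine such as the one below. This is a sketch only: the one-entry-per-line layout, the tolerance of surrounding whitespace, the sample entries, and the name `parse_vocabulary` are assumptions rather than details of the SRI software.

```python
def parse_vocabulary(text):
    """Map each word prompt to its output string from 'prompt::output' lines."""
    vocabulary = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        prompt, separator, output = line.partition("::")
        if not separator:
            raise ValueError(f"missing '::' in line: {line!r}")
        vocabulary[prompt.strip()] = output.strip()
    return vocabulary

# Hypothetical entries; these are not taken from Appendix A or B.
sample = "advisory::ADVISORY\ndefcon two::SET DEFCON 2\n"
vocabulary = parse_vocabulary(sample)
```

Separating the prompt from the output string in this way lets a multi-word utterance produce an arbitrary keystroke sequence.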

H.       PROCEDURE  AND  DATA  COLLECTED 

Several  different  testing  periods  were  scheduled  over  a  three  month  period.  Both 
subjects  traveled  to  the  SRI  International  building  in  Palo  Alto,  Ca.  to  participate  in 
the  testing.  The  session  started  by  logging  onto  the  SRI-BOZO  net  via  the  Sun 
Microsystems  Computer  terminal.  The  appropriate  windows  were  displayed  and  the 
MARA  system  was  automatically  enabled  during  the  login  sequence. 

The trainer program was used only once for each vocabulary. One user (M1)
used three training passes while the other user (M2) used only two passes. There was
no  need  throughout  the  three  months  to  retrain  the  vocabularies.  A  selective  retraining 
of  several  words  was  accomplished  to  demonstrate  the  ease  of  retraining  or  adding  new 
words. 

Under the main directory of XPS were the subdirectories of templates
POOCK.TEMPLATES  and  MIKE.TEMPLATES.  The  word  recognition  program  was 
enabled  and  the  file  of  100  words  or  240  words  was  called.  The  program  automatically 
searched  the  alphabetical  subdirectories  and  loaded  the  proper  templates  on  to  the 
speech  board.  It  took  an  average  of  130  seconds  to  load  the  240  word  templates.  For 
data collection purposes each session was recorded to a file with the five lowest-scoring
words and their scores for each utterance. When possible, the other subject would record
errors as he witnessed them to confirm the recorded data. Additionally, any
abnormalities or peculiarities the system displayed would be more apparent to the
observer, thus freeing the subject to concentrate on the word list.

In  an  effort  to  demonstrate  the  robustness  of  the  system,  the  different  lists  were 
read  with  varying  speeds.  The  vocabulary  was  tested  forward,  backward,  and  randomly 
at both a normal speaking rate and then at a significantly quicker pace. In addition,
the subjects attempted to demonstrate the interoperability of the same voice patterns
between the two subjects by using each other's templates. A joint template was
attempted, but due to the relatively small spectral distance allowed in the training
algorithm cluster averaging technique, after four passes no single joint template could
be created.

Several  runs  were  conducted  in  a  noisy  environment.  A  cassette  tape  of 
machinery noise was played at a level of 74 dB(A) at the microphone. This level is
considerably  higher  than  one  could  expect  in  a  command  and  control  environment 
even  in  a  shipboard  tactical  decision  center. 

The  vocabulary  can  easily  be  modified  by  editing  the  file.  If  a  file  is  modified  to 
include  a  word  not  yet  trained,  the  speech  program  indicates  that  it  could  not  find  a 
template  for  that  word.  Otherwise,  it  would  load  any  template  that  was  specified  in  the 
vocabulary  regardless  of  whether  or  not  it  was  trained  at  the  same  time  or  a  part  of 
another  vocabulary. 

During one of the testing periods, the subjects used a syntactic feedback system
demonstrated by SRI to NAVELEX in July 1984. (Murveit, 1986) The syntactic
feedback  system  is  a  specially  designed  algorithm  to  correct  recognition  errors  in  a 
sentence. The grammar is structured as a finite state machine with beginning, end, and
transition states. The program would compute the least-cost path through a series of
weighted  arcs  and  then  select  the  recognized  sentence.  For  instance,  in  a  data  base 
query  if  a  word  or  words  were  misrecognized  by  the  recognition  system,  it  could  be 
corrected  by  the  syntactic  feedback  algorithm. 
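
The least-cost path idea can be sketched with a small finite state grammar. This is an illustration under stated assumptions, not SRI's algorithm: the grammar, the recognizer scores, the penalty for words the recognizer did not propose, and the name `best_sentence` are all hypothetical.

```python
def best_sentence(grammar, start, finals, candidates, missing_cost=100.0):
    """Return the lowest-cost word sequence the grammar allows, or None.

    grammar:    state -> list of (word, next_state) arcs
    candidates: one dict per utterance mapping word -> recognizer score
                (lower is better); unlisted words cost missing_cost.
    """
    paths = {start: (0.0, [])}  # state -> (total cost, words so far)
    for scores in candidates:
        extended = {}
        for state, (cost, words) in paths.items():
            for word, target in grammar.get(state, []):
                total = cost + scores.get(word, missing_cost)
                if target not in extended or total < extended[target][0]:
                    extended[target] = (total, words + [word])
        paths = extended
    endings = [paths[state] for state in finals if state in paths]
    if not endings:
        return None
    return min(endings, key=lambda path: path[0])[1]

# Hypothetical data base query grammar: a verb followed by an object.
grammar = {
    "BEGIN": [("show", "VERB"), ("list", "VERB")],
    "VERB": [("ships", "END"), ("tracks", "END")],
}
# The recognizer alone would pick 'ships' then 'show' (the lowest score per utterance).
candidates = [{"ships": 8.0, "show": 10.0}, {"show": 9.0, "ships": 12.0}]
sentence = best_sentence(grammar, "BEGIN", {"END"}, candidates)
```

Here the best word-by-word guess, 'ships show', is not a legal sentence, so the grammar selects 'show ships' as the least-cost path.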

Throughout  the  testing  period  it  was  evident  that  a  good  background  in  the 
UNIX operating system and familiarity with the MARA system were major prerequisites
to  effective  use  of  the  speech  recognition  system.  Software  improvements  in  user 
interaction  and  a  well  written  operating  manual  for  reference  would  have  been  helpful. 

I.        RESULTS 

1.  Accuracy 

Results for the 1000 word vocabulary tests conducted by SRI and reported in
Murveit (1986) are shown below in Table 1. M1, M2, F1, and F2 refer to individual
male and female subjects. The percentages refer to word recognition.


TABLE 1

SRI 1000 WORD RECOGNITION PERFORMANCE

M1     89-91 %
M2     89-93 %
F1     91-93 %
F2     86-90 %

The data shown in Table 2, reproduced from Murveit (1986), reflect the results
of SRI's speech recognition system utilizing the TI-20-word data base used to test
commercial speech recognition systems. (Doddington and Schalk, 1981)

The  results  of  the  tests  conducted  by  our  subjects  appear  in  Tables  3  through 
6.  These  tables  represent  the  trials  with  the  variability  in  speech  speed  and  no 
maximum  rejection  threshold  specified. 

A two-sample t test on arcsine-transformed accuracies was completed using the
MINITAB statistics package, showing no significant difference between the two means of
our subjects at the 0.05 level of significance. (Minitab, 1981)
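
For readers unfamiliar with the procedure, the calculation might be sketched as follows. The per-trial accuracies below are invented for illustration and are NOT the trial data from this study; the arcsine square-root form of the transformation and the pooled-variance form of the t statistic are also assumptions about what the statistics package computed.

```python
import math
from statistics import mean, variance

def arcsine_transform(proportion):
    """Arcsine square-root transform, commonly applied to proportions."""
    return math.asin(math.sqrt(proportion))

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df

# Hypothetical per-trial accuracies for two subjects (illustration only)
subject_a = [arcsine_transform(p) for p in (0.94, 0.96, 0.98)]
subject_b = [arcsine_transform(p) for p in (0.95, 0.97, 0.99)]
t_value, df = pooled_t(subject_a, subject_b)

# Two-tailed critical value of t at the 0.05 level for 4 degrees of freedom
T_CRITICAL_DF4 = 2.776
significant = abs(t_value) > T_CRITICAL_DF4
```

The transformation is used because raw percentages near 100 have compressed variance; on the transformed scale the usual t test assumptions are more nearly met.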
2.   Interoperability  of  Voice  Patterns  for  Different  Users 

The results of the interoperability tests are shown in Table 5 and show an
obvious decrease in accuracy. The computed scores, or differences between the
recognized words and the templates, were on the average 10 points higher than the
mean of the subjects' scores with their own templates.

TABLE 2

SRI TI DATA BASE PERFORMANCE (ERRORS OF 320)

16 SPEAKERS, 320 UTTERANCES EACH     TOTAL 13 ERRORS     MEAN ERROR RATE .25%

TABLE 3

NTS 100 WORD VOCABULARY TEST

M1     94-98 %       8 TRIALS     AVG 96 %
M2     91-99 %      12 TRIALS     AVG 97 %

TABLE 4

NTS 240 WORD VOCABULARY TEST

M1     95-100 %      7 TRIALS     AVG 97 %
M2     98-100 %      7 TRIALS     AVG 99 %

TABLE 5

INTEROPERABILITY TESTS

M1 using M2 Templates     80-89 %     3 TRIALS
M2 using M1 Templates     78-86 %     3 TRIALS

3.  Accuracy  in  a  Noisy  Environment 

The endpoint detection process, which is computed in the front end section of
the card, also keeps track of the background noise level and effectively ". . . eliminates
moderate room noises and maintains proper signal levels in the converter and analysis
circuits." (Murveit, 1986) The background noise elimination features of the
microphone and the system allowed it to perform with virtually no degradation in
recognition performance. It is interesting to note that the system was not capable of
any recognition at approximately 76 dB(A). Table 6 shows the results in a noisy
environment.


TABLE 6

NOISY ENVIRONMENT

M1     99 %        2 TRIALS
M2     96-98 %     2 TRIALS


J.        SYNTACTIC  FEEDBACK 

The subjects during one testing session exercised the syntactic feedback system
using a limited vocabulary and allowable sentence structure. Murveit (1986) suggests a
number of open questions about this approach. These issues should be pursued,
since an increase in accuracy is realized in using this algorithm.
K.       CONCLUSIONS AND RECOMMENDATIONS

The purpose of these tests was to examine the voice recognition performance of
the SRI 'Berkeley' 1000-word discrete speech recognition board. The results of our
testing confirm the results reported by SRI Project 6096 (Murveit, 1986). Their
1000-word speech recognition system is very accurate and quite fast. Throughout the
entire study, no degradation of the templates occurred. The experiment was conducted
entirely on initial templates. Despite the variability in speaking rate, three months of
intermittent testing, and testing in a noisy environment, the system performed proficiently.

However, the SRI 'Berkeley' board in its present configuration does not meet all
the requirements necessary to be a viable interface in the CCWS. Although
commercial discrete speech recognition system vendors advertise an input rate of 60
words per minute, discrete speech recognition systems are not suitable for a Command and
Control environment. The user must modify his speaking rate by pausing after each
utterance to use the system effectively. It would be insensitive to the ultimate users in
a CCWS environment to assume that discrete utterances in a high tempo, high
pressure, and possibly high threat situation are even remotely acceptable. A connected
or even a continuous speech recognition system is the only suitable alternative. This
gives the Commander the best opportunity to process information quickly and
accurately, allowing him more time to make a timely and knowledgeable decision.
V.  TEST, ANALYZE, AND EVALUATE A COMMERCIAL CONNECTED
VOICE RECOGNITION SYSTEM IN A WARGAMING ENVIRONMENT

The previous chapter analyzed the reliability of a 1000 word discrete speech
recognition system. The SRI speech board is a state-of-the-art system which was quite
good and very accurate. The disadvantage was, of course, the use of a discrete system in
a command and control environment.

The purpose of this chapter is to analyze the performance of a relatively
inexpensive, commercially available continuous speech system. The VOTAN 6050
Model II product was examined for its applicability and adaptability to a command
and control environment in a particular Naval Warfare Interactive Simulation System
(NWISS). VOTAN has been used in many experiments, tests, and applications and is
regarded by many as a very capable speech recognizer. For example, this same
recognizer was demonstrated in the Navy's air traffic control trainer and simulator and
performed quite well. The VOTAN was used in this experiment to focus on four major
areas:

(1)  An application of a continuous speech system in a Command and Control
environment similar to a workstation module.

(2)  Investigate any significant differences in the ability to input commands by
speech or keyboard entry.

(3)  Investigate the possibility of utilizing a speech recognition system in Navy
Tactical trainers to overcome the dead time in learning the game command
keystrokes and entry procedures.

(4)  Investigate any significant differences in speed of command entry for users
with familiarity with standard Navy phraseology versus those unfamiliar with
using speech recognition systems.

There is considerable time expended at every tactical trainer by the users in
familiarizing themselves with the equipment and game command entry procedures.
This 'dead' time could be eliminated by using a standardized vocabulary as used in
Navy contact reporting procedures and incorporating speech recognition to minimize
keyboard operation and special game commands. The result would be an increase in
useful tactical trainer time. Before examining the VOTAN speech system we will
briefly describe NWISS and the similarities to the proposed specifications for the
Command and Control Workstation (CCWS).
A.       DESCRIPTION OF THE NAVAL WARFARE INTERACTIVE SIMULATION
SYSTEM (NWISS)

NWISS is a real-time, user-interactive simulation of naval warfare. Its mission
was originally to train senior Naval Officers in force-level tactical decision making and
management of command and control. The NWISS game resides on a VAX 11/780
computer, with a network of peripheral VT100/102 and ADM31 terminals and RAMTEK
graphics terminals providing the necessary displays and interactive stations. The
equipment is located in the Naval Postgraduate School Wargaming Analysis and
Research (WAR) Laboratory. There is a sufficient amount of equipment to support
three separate bays or areas to simulate disjoint command and control modules.

The equipment available in the wargaming and research laboratory is very similar
to the equipment for the CCWS. The Distributed Command System (previously
shown in Figure 1.1) shows the Interim Battle Group Tactical Trainer (IBGTT), which
is a component to be interfaced into the local area network. NWISS is to be
integrated into the IBGTT network in 1987. In applying a continuous speech
capability on the NWISS, we can analyze the requirements for a continuous speech
system in a C² environment.

The RAMTEK monitor is the display system used in the NWISS modules. The
presentation is nothing more than a typical Naval Tactical Data System (NTDS)
picture with some exceptions and is similar to the display envisioned for the CCWS.
All ships, planes, and submarines are displayed utilizing standard Navy symbology as
shown in Figure 5.1, with some differences. The exceptions to the standard shipboard
NTDS console display are summarized below:

•  NWISS has color enhanced symbology (an excellent screen improvement).

•  The track symbology in NWISS does not reflect engagement status of tracks.

•  Track information is available only on display boards and is not accessible from
the graphic display screen.

•  Electronic (ESM) and acoustic (SONAR) emissions lines of bearing are color
coded as well.

•  Old tracks change to yellow to indicate a fading track.

•  NWISS does not have the representative symbology available in NTDS to indicate
type of platform.

•  NTDS has balltab capability for immediately obtaining information on the
status of tracks.

The color scheme displays all known friendly forces in blue, enemy forces in red, and
unknown contacts in white, with fading tracks indicated in yellow.
Figure 5.1     NTDS Symbology.

B.       SCENARIO 

The  scenario  for  the  NWISS  game  was  designed  to  place  subjects  in  situations 
requiring  the  input  of  many  combinations  of  the  various  commands  available.  It  was 
the  first  exposure  for  most  of  the  subjects  to  a  multi-threat  Naval  wargame  since  it  was 
the  introductory  simulation  course  for  students  of  the  Naval  Postgraduate  School 
Command  and  Control  curriculum.  Each  group  of  students  embarked  in  separate 
aircraft carriers or command and control modules. The objective was designed to
demonstrate: 

•  High  Resolution  Color  Graphics 

•  Friendly  man  -  machine  interface 

•  The level of detail required to plan, run, summarize, and analyze a relatively
low level wargame

•  The NPS WAR Lab capabilities

Additionally,  the  purpose  of  each  of  the  runs  was  to  familiarize  the  subjects  with 
the  game  and  experiment  with  the  various  commands  and  display  boards.  The  actual 
situation  briefing  used  in  these  tests  is  included  in  Appendix  C. 

C.       VOTAN  SPEECH  RECOGNITION  SYSTEM  MODEL  6050  SERIES  II 

The  VOTAN  VTR  6050  Series  II  is  a  stand  alone  unit  which  can  interface  with 
any  system  supporting  a  standard  RS-232  port.  It  has  the  ability  to  operate  in  two 
distinct  modes:  Voice  Terminal  (VTR)  and  Voice  Peripheral  (VP).  The  VTR  mode 
allows  the  equipment  to  interface  directly  between  a  terminal  and  a  host.  This  is  the 
mode that was used in the NWISS game with an ADM31 terminal and the VAX
11/780 as host. The configuration to run NWISS with the VOTAN appears in Figure
5.2.  The  VP  mode  is  designed  for  telephone-based  applications.  This  mode  was  not 
used  in  this  experiment  and  will  not  be  discussed. 


[Diagram: ADM31 (TERMINAL) and VAX 11/780 (HOST)]

Figure 5.2     Configuration To Run NWISS With The VOTAN.

1.  Vocabulary  Size 

The  VOTAN  6050  Series  II  has  three  internal  components  which  support  its 
vocabulary.    These  are: 

•  VTR System Memory     (approximately 500K)

•  Floppy Disk Memory     (maximum of 760K)

•  Voice Card Memory       (maximum of 22K)
In addition to these components there is also the possibility of storing voice
files on the host computer. This was not used since the vocabulary was small enough
to be stored directly in system memory. The average word or template uses 200-250
bytes of memory. When the system is fully loaded, there can be 2000-3000 words in
main memory. It is important to note that all voice recognition takes place on the
voice card. The voice card can accommodate up to 50 words (from the 2000-3000 in
main memory) at a time. A tradeoff can be seen in the number of words vs. the
number of templates for each word. The more accuracy required, the more templates
needed for each word, and the fewer words loaded into each active set.
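
The tradeoff between words per set and templates per word can be illustrated with
simple arithmetic. The following is only a rough sketch: it assumes that 22K means
22 x 1024 bytes, that every template costs the same number of bytes, and it ignores
the space taken by task words and common words. The function name is hypothetical.

```python
# Rough capacity estimate for one active set on the voice card.
# Assumptions (not from the VOTAN documentation): 22K = 22 * 1024 bytes,
# uniform template size, and no space reserved for task or common words.

VOICE_CARD_BYTES = 22 * 1024


def words_per_set(templates_per_word: int, bytes_per_template: int) -> int:
    """Return how many words fit on the card for a given template count."""
    if templates_per_word < 1 or bytes_per_template < 1:
        raise ValueError("template count and size must be positive")
    return VOICE_CARD_BYTES // (templates_per_word * bytes_per_template)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        low = words_per_set(n, 250)   # largest quoted template size
        high = words_per_set(n, 200)  # smallest quoted template size
        print(f"{n} template(s) per word: {low}-{high} words")
```

Under these assumptions, two templates per word allow roughly 45 to 56 words on the
card, which is in line with the 50 word figure quoted above.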

The main memory can contain multiple sets, and it takes only about 150 msec to
upload a set onto the voice card memory. The vocabulary can be tailored
to switch sets automatically upon hearing a switch word, or sets can be switched automatically
when a certain number of words are recognized from an on-line set. A switch is a
mnemonic that is spoken by the user to load the voice card with a specific set of
templates. This file is transferred at a rate of 9600 baud. During the upload period the
VTR is automatically recording speech (up to 7 secs) to be searched immediately upon
completion of the swap. The swap is extremely fast and is virtually unnoticed by the user.
The VOTAN Guide To Procedures recommends limiting the number of
words in a set to about 10 to 20. A set of this size will optimize recognition and
provide a quicker system response time.
2.   Programming 

The  VTR  6050  Series  II  can  be  easily  programmed.  The  key  element  in 
optimizing  the  performance  of  the  system  is  careful  construction  of  the  vocabulary  so 
as  not  to  exceed  the  voice  card  memory  limitations  and  to  minimize  set  changes.  With 
the  VTR  in  the  off-line  mode,  (which  blocks  any  keystrokes  from  going  to  the  host),  a 
vocabulary  is  entered  directly  onto  the  screen  in  an  editor  mode.  The  user  specifies  the 
file  name  and  then  begins  entering  headings  for  the  word  sets  followed  by  the  actual 
words  in  the  set.  The  following  is  an  example  of  a  small  file  which  is  included  to  show 
the  various  programming  commands  available:    (VOTAN  GTP  and  UG,  1985) 

EDT NUMBERS        *(this allows you to enter the EDITOR mode)
S=NUMBERS,         *(this specifies the set name NUMBERS)
NS=COLORS,         *(this is the pointer to the NEXT SET: COLORS, which is)
CT=2,              *(automatically loaded after 2 recognitions of this set)
CM                 *(indicates NUMBERS is a COMMON word always in memory)
ONE,HS=1           *(ONE is the prompt and 1 is the string sent to the host)
TWO,TS=2           *(TWO is the prompt and 2 is the string to the terminal)
THREE,TS=3\20      *(the \20 is the hexadecimal string for space to be)
                   *(sent to the terminal after the 3)
FOUR,HS=4          *(FOUR is the prompt and 4 is the string to the host)

Appendix  D  is  the  listing  of  the  vocabulary  used  for  the  NWISS  game  and  will  be 
discussed  later  in  the  chapter. 
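
The behavior implied by the example file (host strings, terminal strings, and an
automatic switch to the next set after a count of recognitions) can be modeled in a
few lines. This is an illustrative sketch only and is not VOTAN software: the class
names are hypothetical, the COMMON flag is ignored, and the contents of the COLORS
set are invented for the example.

```python
# Illustrative model of VTR word sets; not VOTAN's implementation.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class WordSet:
    name: str
    next_set: Optional[str] = None   # NS= pointer to the next set
    count: Optional[int] = None      # CT= recognitions before switching
    # prompt -> (destination, string); HS= goes to "host", TS= to "terminal"
    words: Dict[str, Tuple[str, str]] = field(default_factory=dict)


class VoiceTerminal:
    def __init__(self, sets: Dict[str, WordSet], initial: str) -> None:
        self.sets = sets
        self.active = initial
        self.hits = 0
        self.sent: List[Tuple[str, str]] = []

    def recognize(self, prompt: str) -> bool:
        """Route a recognized prompt; return False if not in the active set."""
        current = self.sets[self.active]
        if prompt not in current.words:
            return False
        self.sent.append(current.words[prompt])
        self.hits += 1
        if current.next_set and current.count and self.hits >= current.count:
            self.active = current.next_set   # automatic set switch
            self.hits = 0
        return True


def example_sets() -> Dict[str, WordSet]:
    numbers = WordSet("NUMBERS", next_set="COLORS", count=2, words={
        "ONE": ("host", "1"),
        "TWO": ("terminal", "2"),
        "THREE": ("terminal", "3\x20"),      # hexadecimal 20 is a space
        "FOUR": ("host", "4"),
    })
    colors = WordSet("COLORS", words={"RED": ("host", "RED")})  # hypothetical
    return {"NUMBERS": numbers, "COLORS": colors}
```

After two recognitions from NUMBERS, the model makes COLORS the active set, so a
further number is rejected until the user switches back.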
3.  Operation 

While  the  VOTAN  6050  Series  II  is  still  in  the  off-line  mode,  the  user's 
vocabulary  and  templates  are  placed  into  memory.  In  addition  to  the  set  in  memory 
there  are  certain  words  called  TASK  WORDS  which  control  operation  of  the  VTR 
when  it  is  on-line,  and  a  collection  of  words  in  the  user  tailored  vocabulary  which  can 
be  indicated  as  COMMON  words  that  are  also  a  part  of  the  total  allowed  templates  on 
the  voice  card  memory.  The  user  can  specify  an  initial  word  set  that  will  be  activated 
each  time  the  system  is  initialized.  Additionally,  the  user  can  specify  whether  or  not 
data  buffering  should  be  used.  Data  buffering  allows  the  system  to  store  a 
predetermined  number  of  strings  or  characters  before  outputting  them  to  the  host. 
Data  buffering  can  be  extremely  beneficial  when  a  user  needs  to  verify  a  string  of 
words  prior  to  being  sent.  Numerous  military  situations  require  validation  of  codes  or 
strings  to  ensure  proper  actions  upon  receipt.  The  default  condition  is  immediate 
action  when  the  word  or  phrase  is  recognized.  These  are  some  one  time  preliminary 
set-up  inputs.  Once  this  is  accomplished  the  system  is  ready  to  be  put  in  the  on-line 
(ONL)  mode.  This  sends  the  host  string  directly  to  the  computer  upon  recognition. 
These  keystrokes  are  then  returned  by  the  host  and  displayed  on  the  screen.  The 
keyboard can still be used and the VOTAN is transparent to the user when passing
these keystrokes directly to the host.
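
Data buffering as described above can be pictured as a small queue in front of the
host link. The sketch below is an illustration under stated assumptions, not the
VOTAN implementation: the class name, the capacity parameter, the cancel operation,
and the choice to join buffered strings with a space are all hypothetical.

```python
# Illustrative data buffering: hold strings until a preset count is reached.
from typing import Callable, List


class BufferedOutput:
    def __init__(self, send: Callable[[str], None], capacity: int = 0) -> None:
        self.send = send            # function that delivers a string to the host
        self.capacity = capacity    # 0 means no buffering (immediate action)
        self.pending: List[str] = []

    def recognized(self, host_string: str) -> None:
        if self.capacity == 0:
            self.send(host_string)  # default condition: send at once
            return
        self.pending.append(host_string)
        if len(self.pending) >= self.capacity:
            self.flush()

    def cancel(self) -> None:
        """Discard strings the user has rejected before they reach the host."""
        self.pending.clear()

    def flush(self) -> None:
        if self.pending:
            self.send(" ".join(self.pending))
            self.pending.clear()
```

With a capacity of three, nothing reaches the host until the third string is
recognized, which gives the user a chance to verify or cancel the entry.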
4.  Training Algorithms

The  VOTAN  6050  Series  II  offers  two  types  of  training  algorithms: 
single/discrete  training  and  continuous  training.  In  the  single  training  mode,  one 
template  is  formed  after  each  utterance.  The  continuous  training  method  extracts 
templates  from  a  series  of  passes  for  each  word  in  the  set.  This  takes  into  account  the 
coarticulation  of  a  word  at  the  beginning,  middle,  and  the  end  of  a  group  of  words. 
Prior  to  entering  the  continuous  training  mode,  the  user  must  have  at  least  two  single 
trained  templates  available  for  template  extraction  to  occur.  The  user  specifies  the  set 
which  he  would  like  for  continuous  training.  The  algorithm  then  automatically  selects 
up  to  ten  words  at  a  time  and  presents  to  the  user  a  series  of  five  of  these  words  in 
random  order  on  the  screen.  The  user  repeats  all  five  words  in  a  continuous  manner. 
It  will  then  display  two  columns  of  words  if  a  sufficient  number  of  words  were 
recognized. The first column lists the words that were displayed as the prompts. The
second  column  contains  the  words  that  the  system  recognized. 

Several  misrecognitions  may  be  observed;  however,  the  algorithm  uses  the 
other  correctly  recognized  words  for  forming  the  extracted  templates.  This  ability  to 
develop  these  extracted  templates  enables  VOTAN  to  make  the  claim  of  having  a 
continuous  recognizer.  The  operator  can  manipulate  the  presentation  during 
continuous  training  to  ascertain  the  progress  of  completion  of  a  recognition  matrix  for 
the  current  set  of  words  being  trained.  The  matrix  has  three  columns  for  each  word 
indicating where the word occurred in a string of words (i.e., beginning, middle, or end).
There are some training passes where there will be an insufficient number of words
recognized and the system will prompt the user to continue training a new set. After a
certain number of passes or when the matrix is completely filled, the program will
terminate the training of that word group and continue with the next set of ten words.
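
The recognition matrix described above can be sketched as a table with one row per
word and one cell per position. The code below is a simplified illustration, not the
VOTAN algorithm: it assumes that prompts and recognized words are compared position
by position, and the class and method names are hypothetical.

```python
# Illustrative continuous-training matrix: beginning / middle / end per word.
from typing import Dict, List, Sequence, Tuple

POSITIONS = ("beginning", "middle", "end")


class TrainingMatrix:
    def __init__(self, words: Sequence[str]) -> None:
        self.filled: Dict[str, Dict[str, bool]] = {
            word: {pos: False for pos in POSITIONS} for word in words
        }

    @staticmethod
    def position(index: int, length: int) -> str:
        if index == 0:
            return "beginning"
        if index == length - 1:
            return "end"
        return "middle"

    def record_pass(self, prompted: Sequence[str],
                    recognized: Sequence[str]) -> int:
        """Fill cells for correctly recognized words; return newly filled cells."""
        new_cells = 0
        for i, (want, got) in enumerate(zip(prompted, recognized)):
            if want != got:
                continue  # a misrecognition contributes no template
            pos = self.position(i, len(prompted))
            if not self.filled[want][pos]:
                self.filled[want][pos] = True
                new_cells += 1
        return new_cells

    def unfilled(self) -> List[Tuple[str, str]]:
        return [(w, p) for w, row in self.filled.items()
                for p, done in row.items() if not done]

    def complete(self) -> bool:
        return not self.unfilled()
```

A pass that repeats an earlier word order fills no new cells, which shows why the
order of presentation matters once only a few cells remain.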

Prior  to  operating  the  system  in  VTR  mode  which  transmits  the  output  strings 
to  the  host  or  terminal,  the  user  can  invoke  a  program  to  test  his  templates  and  to 
ensure  voice  card  storage  has  not  been  exceeded.  The  output  display  upon  recognition 
consists  of  the  recognized  prompt  characters  and  the  recognition  score.  The 
recognition  score  is  computed  from  the  spectral  distances  between  the  template  and  the 
spoken word. As with the SRI system, the lowest score indicates the best recognized word. The
recognizer  has  a  minimum  recognition  threshold  default  of  50,  but  the  user  can  modify 
this  value  if  desired.   This  level  appears  to  be  quite  adequate  for  most  applications. 
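
Lowest-score matching with a rejection threshold can be sketched as follows. This is
a minimal illustration under stated assumptions, not the VOTAN scoring method: it
uses plain Euclidean distance between fixed-length feature vectors as a stand-in for
the spectral distance, and it assumes the threshold of 50 acts as an upper bound on
acceptable scores. All names are hypothetical.

```python
# Illustrative template matching: lowest distance wins, subject to a threshold.
import math
from typing import Dict, List, Optional, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here in place of a real spectral distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(utterance: Sequence[float],
               templates: Dict[str, List[Sequence[float]]],
               threshold: float = 50.0) -> Optional[Tuple[str, float]]:
    """Return (word, score) for the closest template, or None if rejected."""
    best: Optional[Tuple[str, float]] = None
    for word, word_templates in templates.items():
        for template in word_templates:
            score = distance(utterance, template)
            if best is None or score < best[1]:
                best = (word, score)
    if best is None or best[1] > threshold:
        return None  # nothing close enough to report as a recognition
    return best
```

A real recognizer would also have to align utterances of different lengths in time;
that step is omitted here.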
D.  SUBJECTS

Six  male  officers  participated  in  this  experiment.  Five  were  Naval  Officers  from 
various  communities.  Three  had  previous  experience  with  the  modeled  systems  and 
were  familiar  with  the  terminology  of  giving  similar  orders.  These  were  the  individuals 
used  in  validating  the  area  of  familiarity  with  battle  group  phraseology  vs.  having  no 
experience.  All  but  one  of  the  officers  had  less  than  12  hours  total  exposure  to  voice 
recognition  systems.  The  other  officer  had  about  100  hours  experience  with  various 
voice  systems. 

E.  THE  VOCABULARY 

The  vocabulary  for  the  NWISS  wargame  consists  of  two  major  groups  of 
commands:  DISPLAY  and  ACTIVE.  The  DISPLAY  commands  control  all  aspects  of 
the graphic plot as displayed on the RAMTEK monitor. The ACTIVE commands
consist  of  many  different  orders  that  could  be  given  to  ships,  submarines,  and  aircraft. 
There  are  actually  a  total  of  230  allowed  words  that  are  recognized  by  the  NWISS 
game.  The  NWISS  game  requires  that  the  commands  be  ordered  in  a  particular  way. 
For  example,  after  activate,  the  game  would  expect  to  see  7  different  commands,  and 
would  disallow  other  inputs.  These  same  words  could  appear  in  different  positions  in 
different  correct  commands  to  the  host  (this  plurality  in  commands  occurs  throughout 
the  vocabulary).  In  addition,  the  number  of  options  after  identifying  a  force  name  can 
range  up  to  50-60  possible  commands,  greatly  exceeding  the  limitation  of  the  voice 
card. This peculiarity required a more general tailoring of the vocabulary to model the
NWISS word structure, since one could not tailor the vocabulary into finite sets
allowing only a small number of words to follow other words. It is similar to the problem
experienced by SRI in formulating the valid structures for the finite
states used in the syntactic feedback system. Consequently, it was impossible to
formulate the vocabulary within the memory and template limitations without multiple
switch words.

Appendix  D  is  the  listing  of  the  vocabulary  used  in  this  experiment.  Note  that 
there are six major vocabularies or sets: Display, Ships, Commands to Units,
Numbers, Aviation, and Load. This was done to minimize the number of switches
necessary for full use of the commands. For example, an actual voice command for
activating an air search radar utilizing the VOTAN would be:

SHIPS SPRUANCE ACTIVATE AIR NUMBERS 1245 ENTER. (6 secs)

The bold words (SHIPS and NUMBERS) are the switch words for the two sets.
The same command by keyboard entry is:

FOR SPRUA ACTIVATE AIR 1245 <cr>
(28 keystrokes) (~ 10 sec) (NWISS, 1983)
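
The keystroke count quoted for the keyboard entry can be checked directly. The short
sketch below counts every typed character, including spaces, plus one carriage
return; the function names are hypothetical and the timings are those given above.

```python
# Check the keystroke count and compare the quoted entry times.

def keystrokes(command: str) -> int:
    """Every character typed, including spaces, plus the final <cr>."""
    return len(command) + 1


def seconds_saved(keyboard_sec: float, speech_sec: float) -> float:
    return keyboard_sec - speech_sec


if __name__ == "__main__":
    typed = "FOR SPRUA ACTIVATE AIR 1245"
    print(keystrokes(typed), "keystrokes")
    print(seconds_saved(10.0, 6.0), "seconds saved by voice for this command")
```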

F.       PROCEDURE 

The  training  was  conducted  in  the  C  WAR  Lab  at  the  same  input  terminal  to 
be used for the game. A SHURE SM-10 close-talking microphone was used for the
training  and  game  play.  The  subjects  used  in  the  experiment  were  trained  in  individual 
sessions  on  the  VOTAN  speech  recognizer.  The  training  took  place  in  one  session 
which  averaged  approximately  75  minutes.  The  enrollment  started  by  loading  copies  of 
the  commands  as  shown  in  Appendix  D  in  active  memory  without  any  templates.  An 
overview  of  how  the  training  was  to  be  conducted  was  given  including  proper 
microphone  placement  and  description  of  the  vocabularies. 

Each  subject  started  by  generating  two  single  trained  templates  for  the  set  of 
NUMBERS,  (this  set  included  all  numbers  0-9  and  letters  A-Z).  The  set  NUMBERS 
was  anticipated  to  require  continuous  training  because  of  the  extensive  use  of  alpha- 
numerics  in  commands.  Following  the  individual  training  of  this  set,  the  continuous 
training  algorithm  was  invoked.  Displaying  the  continuous  training  matrix  during 
training  led  to  the  discovery  that  the  algorithm  is  not  sophisticated  enough  to 
determine  exactly  what  order  it  should  present  the  group  if  there  are  only  a  few  unfilled 
blocks  left  in  the  matrix.  This  can  be  time  consuming  especially  if  the  processor  is 
experiencing  some  difficulties  in  developing  an  extraction  template  for  a  particular 
word.  Upon  completion  of  continuous  training  there  were  now  five  templates  for  each 
word  in  the  set.  It  became  apparent  that  this  number  would  far  exceed  the  number 
allowed  on  the  speech  card  and  therefore  all  single  templates  were  erased.  The 
remaining words were presented for two sets of single/discrete training passes.

After  all  word  sets  were  trained,  each  set  was  displayed  with  the  total  number  of 
templates and memory used. 'Task words' and 'common' words reside on the voice
card  at  all  times.  In  all  cases,  three  of  the  six  possible  sets  had  exceeded  usable 
memory,  as  shown  in  Table  7. 

TABLE 7
INITIAL TEMPLATE LISTING

VOCABULARY SET        AVERAGE MEMORY (BYTES)    AVERAGE NUMBER OF TEMPLATES
TASK_WORDS            1651                      8
COMMON                2916                      18
NUMBERS*              18740                     1 jj
COMMANDS_TO_UNITS     23675                     148
AVIATION              25382                     142
SHIPS                 8900                      50
LOAD                  16229                     86

* SINGLE TRAINED TEMPLATES NOT INCLUDED
EXCEEDS VOICE CARD LIMITATIONS (COMMON AND TASK_WORDS INCLUDED)

A review of the vocabulary and sets showed that 28 words were duplicated
intentionally in the composition of the sets. This design redundancy was to reduce the
number of switches needed for the formulation of proper commands. Consequently,
there were actually four separately trained templates for these words in storage. Two
of the templates for these words were deleted from the active sets. In every case, an
average of 45 additional templates were deleted to bring the memory and number of
templates within the allowed limits. The words that were reduced to only one template
were those words with many syllables that were readily recognized. The actual
number removed varied according to the user and the way each word was enunciated.
That is, if utterances were fairly slow, more memory was required. Table 8 depicts the
average final number of templates and memory remaining in the actual individual files
for all users. The final test was to invoke the trainer program and ensure there were no
memory overflow or template overflow errors produced as the different sets were
loaded onto the voice card. It is recognized that having to delete templates causes a
corresponding decrease in recognition and is a significant limitation imposed by the
system.
TABLE 8
REVISED TEMPLATE LISTING

VOCABULARY SET        AVERAGE MEMORY (BYTES)    AVERAGE NUMBER OF TEMPLATES
TASK_WORDS            1651
COMMON                2916                      18
NUMBERS               13761                     100
COMMANDS_TO_UNITS     16953                     104
AVIATION              17304                     95
SHIPS                 8900                      50
LOAD                  15854                     85
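
The memory figures in Tables 7 and 8 can be checked against the voice card limit
with a short calculation. The sketch below is illustrative only: it assumes the
22K limit means 22 x 1024 bytes, that task words and common words are resident with
every set, and it uses the byte counts as read from the two tables. The function
name is hypothetical.

```python
# Which sets exceed the voice card once resident words are added?
from typing import Dict, List

VOICE_CARD_BYTES = 22 * 1024  # assumption: 22K taken as 22 * 1024 bytes
RESIDENT = {"TASK_WORDS": 1651, "COMMON": 2916}

TABLE_7 = {"NUMBERS": 18740, "COMMANDS_TO_UNITS": 23675,
           "AVIATION": 25382, "SHIPS": 8900, "LOAD": 16229}
TABLE_8 = {"NUMBERS": 13761, "COMMANDS_TO_UNITS": 16953,
           "AVIATION": 17304, "SHIPS": 8900, "LOAD": 15854}


def oversized(sets: Dict[str, int],
              resident: Dict[str, int] = RESIDENT,
              limit: int = VOICE_CARD_BYTES) -> List[str]:
    """Return the names of sets that do not fit alongside the resident words."""
    overhead = sum(resident.values())
    return [name for name, size in sets.items() if size + overhead > limit]


if __name__ == "__main__":
    print("Initial:", oversized(TABLE_7))
    print("Revised:", oversized(TABLE_8))
```

Under these assumptions the initial listing has three oversized sets (NUMBERS,
COMMANDS_TO_UNITS, and AVIATION), and every set in the revised listing fits.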


Each subject had no further training. At the start of the game the subject's
revised  templates  were  loaded  into  the  recognizer.  They  were  allowed  to  perform  their 
roles  by  inputting  commands  as  necessary. 

The short time available to conduct the tests precluded evaluating the
interoperability of data sets (i.e., one user operating from another's voice templates).
Although the system was not designed to accomplish this, it is a point of interest when
evaluating systems in a command and control environment. The concern is that in the
event of a mishap to the active operator, a slow transition to another operator would
have a negative impact on the C² center operation. The time to exchange vocabularies
from one user to another was 62 seconds.

The level of noise in the module was not measured, but during the conduct of the
exercise the noise from group discussions and administration was very similar
to that encountered in a real command and control center. The VOTAN gain can be
easily adjusted if necessary.

Additionally, the 240 word vocabulary (Appendix B) was loaded into the
VOTAN. A comparison of speech recognition accuracy of the VOTAN vs. SRI is
shown in Table 9 using subject M2 from the previous tests. The 240 word vocabulary
was loaded into 5 sets with an average of 96 templates and 19575 bytes of
memory per set to simulate the conditions present for the NWISS vocabulary. It is
evident from the data that exceeding the manufacturer's loading recommendations
does in fact affect performance.


TABLE  9 
SRI  VS  VOTAN  240  WORD  RECOGNITION  ACCURACY  TEST 

M2  SRI    99  %  VOTAN   97.4  % 


G.       RESULTS 

The  experiment  set  out  to  focus  on  four  separate  areas: 

(1)  Demonstrate an application of a continuous speech system.

(2)  Investigate any significant differences in the ability to input commands by
speech or keyboard entry.

(3)  Investigate the possibility of utilizing a speech recognition system in Navy
Tactical trainers to overcome the dead time in learning the game command
keystrokes and entry procedures.

(4)  Investigate any significant differences in speed of command entry of users
familiar with standard Navy phraseology versus those unfamiliar with using
speech recognition systems.

The results from the three separate runs and data collected with the constraints
described show that the VOTAN in its present configuration was unable to adapt to
this C² environment. This is primarily due to the limitations of storage and processing
power of the voice card. The NWISS vocabulary is not suited for designing a distinct
branching method of words from one set to other sets for correct formulation of
commands. This inability to establish a tree architecture for correct command
structure resulted in the number of words in most sets exceeding the recommended
number by 3.5 times. As discussed earlier and in the technical documentation,
the optimum number of 10-20 words would increase recognition and provide a
quicker response time. With an average number of 55 words, the reaction time was
inordinately slow and misrecognitions were higher than expected. Speed of speech
input, as stated by Kavaler (1986), is a function of:

•  Speech rate

•  The processing power of the speech recognizer

•  The constraints placed on the way the user must speak (i.e., discrete vs.
connected, number of 'switch words').

Subjects  entering  commands  by  voice  with  these  constraints  were  confused  and 
frustrated  since  the  time  delay  for  the  recognition  to  appear  on  the  terminal  was  often 
slower  than  one  would  expect  for  keyboard  entry.  Likewise,  if  a  misrecognition 
occurred at some point in the string, a user would have to attempt to back out the
command  or  cancel  it  and  start  the  entire  entry  over  again. 

The  design  of  NWISS  command  entry  procedures  has  some  unique  human 
engineering  advantages  for  keyboard  entry.  The  host  would  not  allow  a  command  to 
be  entered  if  it  did  not  form  a  correct  entry.  The  terminal  would  beep  and  inhibit  any 
incorrect  keystrokes.  The  user  could  type  a  question  mark  '?'  and  the  list  of  acceptable 
entries would be listed. Even though this occurred in the voice entry procedure as well,
the user would be disappointed by the misrecognition and often forget the voice
command 'help', which would output '?'. Eventually, he felt more hostility and mistrust
toward the recognizer and got flustered, forgetting which set he was in and
cancelling the entire command again.
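
The command entry behavior described above (reject an incorrect entry with a beep,
list the acceptable entries on '?') can be sketched with a small table of allowed
next words. This is an illustration only: the grammar below is a hypothetical
one-branch fragment built from the example command, not the NWISS grammar, and it
decides the allowed words from the last entry alone.

```python
# Illustrative next-word validation with '?' help; not the NWISS software.
from typing import Dict, List

# Hypothetical fragment: each token maps to the tokens allowed to follow it.
GRAMMAR: Dict[str, List[str]] = {
    "<start>": ["FOR"],
    "FOR": ["SPRUA"],
    "SPRUA": ["ACTIVATE"],
    "ACTIVATE": ["AIR"],
    "AIR": ["<number>"],
    "<number>": ["<cr>"],
}


def token_for(word: str) -> str:
    """Treat any all-digit entry as the generic <number> token."""
    return "<number>" if word.isdigit() else word


def allowed_next(entered: List[str]) -> List[str]:
    state = token_for(entered[-1]) if entered else "<start>"
    return GRAMMAR.get(state, [])


def enter(entered: List[str], word: str) -> str:
    """Accept a word, refuse it with a beep, or list the options on '?'."""
    options = allowed_next(entered)
    if word == "?":
        return "options: " + ", ".join(options)
    if token_for(word) in options:
        entered.append(word)
        return "ok"
    return "beep"
```

The same check applies whether the word arrives from the keyboard or from the
recognizer, so a misrecognized word is refused in the same way as a mistyped one.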

The  frustration  from  a  misrecognition  was  also  attributable  to  the  unfamiliarity 
with  words  in  the  sets  and  the  proper  NWISS  command  structure.  The  user  usually 
blamed his uncertainty about the set and command structure on himself, adding to his
disappointment and disillusionment with the recognizer. In later trials, a combination
of voice and keyboard was used by some subjects. They used voice for certain words
and commands they felt comfortable with and then used the keyboard for the
unfamiliar commands or for entries they felt required immediate and correct entry.

No advantage could be determined for using a speech
recognition system in Navy tactical trainers to overcome the dead time spent learning the
game command keystrokes and entry procedures. The human engineering in the design
of this particular wargaming system was extremely helpful, both in providing assistance
and prompts and in accepting as few as four keystrokes for certain commands.
Further study is required in this area.

The  subjects  with  some  familiarity  with  wargaming  had  a  distinct  advantage  over 
those  who  did  not,  both  with  and  without  voice  entry.  This  advantage  could  not  be 
directly  attributed  to  the  voice  recognition  application  but  was  quite  evident  in  the 
level  of  play.  They  were  more  comfortable  at  the  input  terminal  and  were  relied  upon 
by  the  other  members  in  the  group  for  advice  to  interpret  the  displays. 

H.  CONCLUSIONS AND RECOMMENDATIONS

Even  though  initially  the  VOTAN  seemed  very  promising  and  an  excellent 
candidate  for  a  C  environment,  this  speech  recognizer  is  not  well  suited  for  CCWS.  It 
failed because the vocabulary limits of the voice card and the processing power of this
recognizer were exceeded by the demands of the NWISS vocabulary. Consequently,
recognition and output speed suffered. A large 1000 word vocabulary
and real-time processing are necessary in the CCWS application for data base queries.
Additionally, the user is required to memorize which set is active and the 'switch'
words needed to enter the various sets. A user of the VOTAN must adapt his
speech to the recognizer, which is unacceptable. The recognizer must be an extension
of the commander, not a hindrance.
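
The memory burden of active sets and 'switch' words can be made concrete with a small sketch. It assumes, for illustration only, that words in the common set are always recognized, that a switch word changes the active set, and that any other word is recognized only while its own set is active. The set contents are abbreviated from Appendix D, and the function name is hypothetical; this is not the VOTAN's actual control logic.

```python
# Abbreviated, illustrative vocabulary sets (see Appendix D for the full lists).
SETS = {
    "COMMON": {"WRONG", "ENTER", "HELP"},
    "NUMBERS": {"ONE", "TWO", "THREE"},
    "AVIATION": {"LAUNCH", "BINGO", "F14A"},
}

# Assumption: saying the name of a set makes that set the active one.
SWITCH_WORDS = {"NUMBERS", "AVIATION"}


def recognize(active, word):
    """Return (new_active_set, accepted) for one spoken word."""
    if word in SWITCH_WORDS:
        return word, True
    if word in SETS["COMMON"] or word in SETS[active]:
        return active, True
    # The word exists only in some inactive set, so it cannot be matched.
    return active, False
```

Under these assumptions, saying LAUNCH while the NUMBERS set is active is rejected until the user first says AVIATION, which is the kind of bookkeeping the subjects had to carry in their heads.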

The combination of voice and keyboard entry employed by some of our subjects
toward the end of the testing indicates a possible area for future study. The application
of speech entry in conjunction with keyboard, mouse, or balltab manipulation should
be investigated. The balltab is the exclusive device for an NTDS console in a
shipboard command and control center. This would allow a smaller, more tailored
vocabulary integrated into existing systems to aid the user, particularly if that
individual must be positioned at a console or terminal.

VI.  CONCLUSIONS

It  is  intuitive  that  the  commander  who  can  manage  and  process  the  tremendous 
flow  of  battle  information  the  fastest  will  have  more  time  to  determine  a  response  or 
make  decisions  which  are  always  ahead  of  his  adversary.  As  the  dependency  of  the 
commander  on  computing  resources  increases,  it  is  only  natural  to  expect  greater 
demands  upon  the  man-machine  interface.  By  including  a  speech  recognition  system 
on the CCWS, the commander would realize a faster information processing rate. This
would let the commander acquire more knowledge in less time on which to
base his decision. As Sun Tzu, the famous Chou Dynasty philosopher and military
strategist, once stated, "... knowledge is power and permits the wise to conquer without
bloodshed and to accomplish deeds surpassing all others."

This  thesis  evaluated  the  performance  of  a  state-of-the-art  1000  word  discrete 
template  matching  system  and  a  commercially  available  VOTAN  continuous  speech 
recognition  system.    The  requirements  specified  for  the  CCWS  were: 

• Large vocabulary (capacity > 1000 words).

• Real-time response.

• Very high recognition accuracy (> 98%).

• Adaptable to the user (i.e., the user should not have to modify or alter his
speaking rate significantly).

• No deterioration in accuracy in noisy and stressful environments.

The systems evaluated in this thesis did not fulfill all the requirements for the speech
application in the CCWS. Each system had its advantages and disadvantages, which
were discussed in the conclusion of each respective chapter. Currently, no
commercially available system is capable of meeting all these requirements.

Even though neither system met all the requirements established for the CCWS,
recent literature reflects the improvements made under the Strategic Computing Program,
particularly in phonetic recognition. Speech systems capable of meeting and exceeding
these specifications are not far away. In fact, CINCPACFLT is scheduled to test and
evaluate the speech recognition system being developed by the Strategic Computing
Program (Strategic Computing, 1985).

As  computers  become  more  and  more  capable  of  displaying,  storing,  and 
processing  information,  it  is  only  natural  to  assume  that  the  interface  between  the  user 
and computer should be optimized. We all can recount from our own experiences
"... the costs of poorly designed interfaces. Coming in many forms, the cost can include
degraded user productivity, user frustration, increased training costs, and the need to
redesign ..." (Foley, 1984). For these very reasons, the design of every interface for
an interactive user-computer system must be of utmost importance. Speech recognition has
long been thought of as the ideal interface and must be considered for all future
systems.

APPENDIX A
SRI  100  WORD  VOCABULARY 


a             dinner        manner        rose
able          direction     many          round
aboard        discovered    March         run
about         distance      mark          running
above         do            market        said
accept        for           Mary          steps
according     force         material      still
account       forced        matter        stock
across        foreign       may           stone
act           forget        maybe         stop
both          form          our           stopped
bottom        forty         out           store
box           forward       outside       story
break         found         over          straight
bring         for           own           street
broken        I'd           page          U.S.
brought       I'll          paid          under
Brown         I'm           paper         understand
building      I've          Paris         union
built         idea          part          university
development   ideas         right-paren   unless
did           if            river         until
didn't        immediately   road          up
different     important     Robert
difficult     impossible    room

APPENDIX B
240  WORD  VOCABULARY 


one 

yankee 

Gary_Poock 

carriage_return 

Iran 

Sweden

login_Poock 

accat_title 

load_gld3 

Poock_NPS_password 

three 

logout 

red_sphere 

zero 

November 

use_that_one 

Captain_Ebbert 

up_in_detail 

level_two_viewer 

genisco_zero_parameters 

five 

alpha 

charlie 

echo 

juliett 

move_it_left 

San_Francisco 

engineering 

voice_technology 

Russian_version_of_Hormuz 

eight


two 

air_routes 

load_the_gun 

load_the_server 

Japan 

Europe 

level_two 

strait_of_Hormuz 

connect_to_charlie 

change_directory_to_hunter 

four 

graphics 

steam_plant 

seven 

move_it_down 

spirograph 

close_out_charlie 

United_States 

North_Atlantic_Map

Mediterranean_Chart

six 

bravo 

delta 

foxtrot 

romeo 

sierra 

application 

human_factors 

central_expressway 

file_transfer_protocol

nine 
hotel

kilo 

oscar 

move_it_right 

Vietnam 

advisory 

business_meeting 

speech_recognition 

efficient_transmission 

golf 

quebec 

victor 

xray 

move_it_up 

Tokyo 

down_in_detail 

criteria 

suitability 

identification 

course 

command 

bingo 

proceed 

altitude 

relocate 

available 

track_esm

command_and_control 

enemy_detection 

launch 

cancel 

bearing 

orders 

satellite 

negative 


india 

lima 

pappa 

uniform 

Korea 

interactive 

continuous 

continuous_speech 

system_integration 

mike 

tango 

whiskey 

zulu 

Bangladesh 

Hollister 

corporation 

advantages 

radiology 

automatic_recognition 

speed 

attack 

report 

station 

recover 

designate 

plot_esm 

designate_track 

probability 

probability_of_detection 

fire 

message 

label 

copy 

envelope 

correlate 
combination

maneuver_delay 

Task_Force_Commander 

proceed_to_New_Delhi 

time 

surface 

minefield 

shore_based 

execute 

enemy 

Connecticut 

Oklahoma 

California 

place_a_marker_on_Paris 

bingo_all_craft_immediately 

neutral 

sensor 

Stockton 

air_field_name 

track_friendly 

bearing_and_distance 

Minnesota 

Eisenhower 

relocate_the_Sunfish 

take 

Georgia 

Texas 

Utah 

latitude 

Ohio 

flight_controller 

Pango_Pango 

lay_a_barrier 

attack_barrier_target 

scope 


sensor_delay 

Alabama 

North_Carolina 

place_a_circle_on_Moscow

shoot 

refuel 

distance 

contact 

submarine 

order_name 

Indiana 

Pennsylvania 

South_Dakota 

map 

grid 

missile 

Adak 

New_York 

track_unknown 

track_neutral 

Louisiana 

Colorado 

New_Mexico

refuel_the_Connie 

place 

Vermont 

Daniels 

platform 

longitude 

torpedo 

Trans_World_Airlines 

keep_on_station 

ground_control_approach 

Atlantic_Data_Base 

drop 
Bangkok

Brisbane 

Antwerp 

Arkansas 

user's_guide 

Acapulco 

Yokohama 

Diego_Garcia 

Pacific_Data_Base 

Maine 

Portland 

Aspro 

red_fox 

blue_force_one 

Baltimore 

Sevastopol 

chronometer 

plot_all_submarines 

Iberian_Carrier


Bombay 
Canton 
Africa 
Saigon 
Kitty_Hawk 
Vladivostok 
Sea_of_Japan 
Indonesia 
Arabian_Tanker 
save 

Rangoon 
Kiev
Naples 
Calcutta 
Wyoming 
Honolulu 
John_Kennedy 
United_Air_Lines 
West_German_Torpedo 

APPENDIX C
SCENARIO  BRIEFING 

FROM:       COMSEVENTH  FLEET 

TO:  COMMANDER, TASK GROUP ONE PT ONE

COMMANDER,  TASK  GROUP  ONE  PT  TWO 

OPORDER  00003 

1  THIS MESSAGE CONSTITUTES AN OPERATION ORDER FOR CTG
ONE PT ONE AND ONE PT TWO. IT CONSISTS OF GEOPOLITICAL
BACKGROUND, COMPELLING EXECUTION OF OPERATION, TASK
FORCE ORGANIZATION, OPERATION OBJECTIVES, SUMMARY OF
OPPOSING FORCES, AND DIRECTION CONCERNING CONDUCT OF
OPERATION.

2  DURING  THE  LAST  48  HOURS  THE  CVBGS  HAVE  DRAWN  NEAR  TO 
EACH  OTHER  AND  NOW  MAY  BE  ORGANIZED  INTO  A  TASK  FORCE 
OF  CONSIDERABLE  SIZE.  AS  LIGHT  DAWNS  THE  JFK  HAS 
RECOVERED  THE  LATE  NIGHT  LAUNCH  WHICH  WAS  CYCLIC  DUE 
TO  THAT  CARRIERS  CLOSER  PASSAGE  TO  ENEMY  LAND  BASES 
AND DUE TO THE JAPANESE ISLANDS THAT COULD NOT BE
ASSUMED TO BE FRIENDLY. THE AIR COMPLEMENT HAS BEEN AT
WORK  FOR  AT  LEAST  48  HOURS.  KITTY  ON  THE  OTHER  HAND 
HAS  JUST  LAUNCHED  A  CAP  GRID  WHICH  IS  PROCEEDING  TO 
POSITION.    IT  INCLUDES  AN  E2  AND  AN  S3. 

A.  AN  E3A  (AWACS)  WAS  SUPPOSED  TO  ARRIVE  ON  STATION  OUT 
OF  ADAK  ON  AN  AIR  FORCE  MISSION  ABOUT  ONE  HOUR  AGO. 
HOWEVER  SHE  HAD  NO  REPORTING  RESPONSIBILITY  TO  THE  OTC  AND 
HER  PRESENCE  HAS  NOT  AS  YET  BEEN  CONFIRMED.  P3S  ARE 
DEPLOYED IN SUPPORT HOWEVER.

TASK  GROUP  ONE  PT  ONE  CONSISTS  OF  THE  FOLLOWING  SHIPS 
LOCATED  12  HOURS  PRIOR  TO  THE  START  OF  YOUR  RUN  FOR  RECORD 
AS  FOLS: 

USS KITTYHAWK 46-30N/157E (APPROX)

USS  WICHITA 

USS  KNOX 

USS  SPRUANCE 

USS  RATHBURNE 

USS  WILSON 

USS  MCCORMICK 

USS  FOX 

USS  LOS  ANGELES 

USS  OMAHA  SOJ 

PATRON  FOUR  SIX  IN  PLACE  MISAWA  AB.  40-00N  141-50E. 

PATRON  SEVENTEEN  IN  PLACE,  ADAK  AB,  51-50N  176-30W. 

UNSUBORDINATED  AWACS  DET  IN  PLACE,  ADAK. 

CVBG  1.2,  JFK  TASK  GROUP  CONSISTS  OF  THE  FOLLOWING  UNITS: 

USS  JOHN  F.  KENNEDY  46-30N  155E  (APPROX) 

USS  IOWA 

USS  LONG  BEACH 

USS  JOHN  ROGERS 

USS  TURNER  JOY 

USS  JOHN  HANCOCK 

USS  MAC 

USS  FURER 

USS  GAR  (NEW  CONSTRUCTION  SSN)  SOJ 

4.    OPERATIONAL  OBJECTIVES:   (A  REPEAT) 

THE  SEA  OF  OKHOTSK  AND  THE  BASES  WHICH  SURROUND  IT  PROVIDE 
A  PRIMARY  SANCTUARY  FOR  THE  SOVIET  FAR  EASTERN  FLEET. 
PROCEED TO A POSITION FROM WHICH YOUR COMBINED FORCES CAN
INTERDICT  SURFACE  AND  SUBSURFACE  FORCES  AND  LAUNCH  STRIKES 
AGAINST  THE  SOVIET  LAND  BASED  AIR  STRONGHOLDS.  PREPARE  TO 
FIGHT  YOUR  WAY  IN  AND  STAY  AS  LONG  AS  POSSIBLE. 

•  PRIMARY MISSION ONE

PLAN  FOR  AND  BE  PREPARED  TO  CONDUCT  A  PREEMPTIVE  AIR  RAID 
ON  PETRO  WHEN  IN  POSITION  AND  WHEN  DIRECTED  BY  HIGHER 
AUTHORITY. 

•  PRIMARY  MISSION  TWO 

SEARCH FOR, IDENTIFY AND REPORT, THE SOVIET MINSK BG, AND ANY
RED  SUBMARINES  WHICH  MAY  BE  ENCOUNTERED.  BE  PREPARED  TO 
CONDUCT  SHORT  NOTICE  PREEMPTIVE  ATTACK  ON  THESE  FORCES 
WHEN  DIRECTED  BY  HIGHER  AUTHORITY. 

5.  SUMMARY  OF  OPPOSING  FORCES:  ANTICIPATED  OPPOSING  FORCES 
CONSIST  OF  THE  SOVIET  TASK  GROUP  COMPRISED  OF: 

ONE  MINSK  CLASS  CGH 

ONE  KASHIN  CLASS  CGL 

ONE  KREST  II  CLASS  CG 

TWO  VICTOR  CLASS  SSN 

TWO  CHARLIE  CLASS  SSGN 

ONE ECHO2 CLASS SSGN

INTELLIGENCE SOURCES INDICATE POSSIBILITY THAT
ADDITIONAL SURFACE UNITS OF UNKNOWN TYPE MAY HAVE
DEPARTED VLADIVOSTOK WITHIN THE PAST 36 HOURS,
ALTHOUGH THIS IS, AS YET, UNCONFIRMED.

24 HOURS PRIOR TO THE START OF YOUR RUN FOR RECORD, THE
SURFACE FORCES WERE IN THE SEA OF OKHOTSK. IT IS ANTICIPATED
THAT ONE SUB WILL CONTINUE WITH THE SOVIET BG. DURING THE
LAST 36 HOURS ONE HOSTILE SSN HAS BEEN DETECTED IN THE
VICINITY OF KITTY. EVASIVE ACTION AND BEST SPEED MAY HAVE
LEFT IT BEHIND FOR THE TIME BEING, HOWEVER. SPEED OF TASK
GROUP ADVANCE HAS BEEN SLOWED AND VIGILANCE TO THE REAR IS
ADVISED. THE CONTACT THOUGHT TO BE SHADOWING THE JFK WAS
NEVER CONFIRMED BY CVBG FORCES OR THE FURER ON HER TRIP
NORTH. THE REMAINING SUBS ARE EXPECTED TO BE IN POSITION TO
OPPOSE YOUR TRANSIT NEAR THE ISLAND PASSAGES NORTHEAST OF
HOKKAIDO. INTEL STILL ESTIMATES THE GREATEST THREAT WILL BE
FROM (1) LAND BASED AIR OF REGIMENTAL SIZE GROUPINGS, AND (2)
FROM SSNs THAT ARE CURRENTLY DEPLOYED OR WILL DEPLOY
SHORTLY. THE SOVIET TASK GROUP CAN BE EXPECTED TO OPPOSE
ENTRY TO THE SEA OF O TO SOME DEGREE.

6.  DIRECTION  CONCERNING  THE  CONDUCT  OF  THE  OPERATION:  THE 
CONDUCT  OF  THE  OPERATION  IS  AT  THE  DISCRETION  OF  THE  OFFICER 
IN  TACTICAL  COMMAND  WITHIN  THE  FOLLOWING  CONSTRAINTS  AND 
POLICY  GUIDANCE: 

1  DEFCON CONDITION TWO. WE ARE NOT AT WAR. IF POSSIBLE,
AVOID ACTIONS WHICH COULD PROVOKE A WAR. CONFIRM AS
EARLY AS POSSIBLE WHICH COMMANDER, CVBG 1.1 OR CVBG 1.2,
WILL  BE  OTC.  KITTY  IS  STILL  THE  ONLY  SHIP  WITH  KEYING 
MATERIAL  NECESSARY  TO  GAIN  LAND  BASED  AIR  SUPPORT 
FROM  ADAK  (THIRD  FLEET)  AND  MISAWA  (SEVENTH  FLEET). 
EXPECT  LATE  BREAKING  GUIDANCE  FROM  THIS  HEADQUARTERS 
AS  EVENTS  IN  EUROPE  COULD  SIGNAL  THE  START  OF  ACTIONS  IN 
THIS  THEATRE. 

2  WEAPONS  ARE  TIGHT  AT  THIS  TIME.  WEAPONS  FREE  STATUS 
MUST  BE  REQUESTED  FROM  ORIG  UNLESS  ATTACKED,  IN  WHICH 
CASE  RESPONSE  IN  KIND  ONLY  IS  AUTHORIZED.  THAT  IS  TO  SAY 
THAT  THE  LOSS  OF  AN  AIRCRAFT  MAY  NOT  BE  RESPONDED  TO 
BY  AN  ATTACK  ON  A  SHIP.    MINIMIZE  ESCALATING  ACTIONS. 

•  THE  FIRST  CHALLENGE  WILL  BE  TO  ORGANIZE  THE  COMBINED 
TASK  GROUP  INTO  AN  EFFICIENT  FIGHTING  UNIT.  NOTIFY  THIS 
HEADQUARTERS  OF  ALL  SIGNIFICANT  DECISIONS.  YOUR  PLAN 
OF  OPERATIONS,  IN  BRIEF,  IS  OF  PRIMARY  INTEREST. 

•  TO  ENSURE  SUSTAINABILITY  IN  THE  EVENT  OF  A  PROTRACTED 
CAMPAIGN  ONLY  36  AIRCRAFT  MAY  BE  AIRBORNE  AT  ANY  GIVEN 
TIME  FROM  EACH  CARRIER  (TOTAL  OF  72).  THIS  DOES  NOT 
INCLUDE  LAND  BASED  P3s  OR  AWACS  AC  UNDER  THE  CONTROL 
OF THE CARRIER. PERMISSION TO USE THIRD FLEET ASSETS
MUST BE GAINED FROM THIRD FLEET, VIA SEVENTH FLEET,
PRIOR TO ISSUING A LAUNCH COMMAND.

SUBMIT   YOUR   PLAN   OF   ACTION   PRIOR   TO   THE    RUN    FOR    RECORD 
CONTAINING: 

1  BELIEVED  ENEMY  INTENTIONS: 

2  YOUR  INTENTIONS: 

3  CONTINGENCY  PLANS: 

APPENDIX D
VOTAN  VOCABULARY  FOR  NWISS 


This file contains the vocabularies set up for the interactive battle group game in the
war lab.

COMMON WORDS SET

001 COMMON
WRONG
ENTER
HELP
DISPLAY
COMMANDS_TO_UNITS
NUMBERS
AVIATION
SHIPS
LOAD

TASK WORDS SET

002 TASK_WORDS
GO_TO_SLEEP
LISTEN_TO_ME
INITIALIZE
VERIFY

003 WRONG,HS=\0B,CM

004 ENTER,HS=\0D,CM

006 HELP,HS=?,CM

DISPLAY WORDS SET

008 DISPLAY,CM
CANCEL,HS=CANCEL\20
CIRCLE,HS=CIRCLE\20
GRID,HS=GRID\20
RADIUS,HS=RADIUS\20
SHIFT,HS=SHIFT\20
DESIGNATE,HS=DESIGNATE\20
XMARK,HS=XMARK\20
CENTER,HS=CENTER\20
FORCE,HS=FORCE\20
POSITION,HS=POSITION\20
DROP,HS=DROP\20
ERASE,HS=ERASE\20
ESM,HS=ESM\20
PLOT,HS=PLOT\20
LINE_OF_BEARING_SONAR,HS=LOB_SONAR\20
LINE_OF_BEARING_ESM,HS=LOB_ESM\20

COMMANDS TO UNITS SET

009 COMMANDS_TO_UNITS,CM
TIME,HS=TIME\20
AIR,HS=AIR\20
RADAR,HS=RADAR\20
EMITTER,HS=EMITTER\20
ALTITUDE,HS=ALTITUDE\20
BEARING,HS=BEARING\20
POSITION,HS=POSITION\20
BLIP_ON,HS=BLIP ON\20
COURSE,HS=COURSE\20
OFF,HS=OFF\20

DESIGNATE,HS=DESIGNATE\20
FRIENDLY,HS=FRIENDLY\20
UNKNOWN,HS=UNKNOWN\20
EXECUTE,HS=EXECUTE\20
LAUNCH,HS=LAUNCH\20
PERISCOPE,HS=PERISCOPE\20
CHAFF,HS=RBOC\20
SUBMARINE,HS=SUBMARINE\20
HANDOVER,HS=HANDOVER\20
JOIN,HS=JOIN\20
RECOVER,HS=RECOVER\20
SEARCH,HS=SEARCH\20

BEARING,HS=BEARING\20
BACKSPACE,HS=\08
SPACE,HS=\20
TRACK,HS=TRACK\20
OLD,HS=OLD\20
SONAR,HS=SONAR\20
PLACE_A,HS=PLACE\20

ACTIVATE,HS=ACTIVATE\20
SURFACE,HS=SURFACE\20
ESM,HS=ESM\20
SONAR,HS=SONAR\20
BARRIER,HS=BARRIER\20
FORCE,HS=FORCE\20
TRACK,HS=TRACK\20
BLIP_OFF,HS=BLIP OFF\20
COVER,HS=COVER\20
DEPTH,HS=DEPTH\20
ENEMY,HS=ENEMY\20
NEUTRAL,HS=NEUTRAL\20
EMCON,HS=EMCON\20
FIRE,HS=FIRE\20
ORDERS,HS=ORDERS\20
PROCEED,HS=PROCEED\20
REFUEL,HS=REFUEL\20
CEASE,HS=CEASE\20
INFORM,HS=INFORM\20
RECALL,HS=RECALL\20
REPORT,HS=REPORT\20
SILENCE,HS=SILENCE\20
TURN,HS=TURN\20
SPACE,HS=\20
SPEED,HS=SPEED\20
TAKE,HS=TAKE\20
ON,HS=ON\20
DECEPTIVE_COUNTER_MEASURES,HS=DECM\20
WEAPONS_FREE,HS=WEAPONS FREE\20
WEAPONS_TIGHT,HS=WEAPONS TIGHT\20
USE,HS=USE\20
BACKSPACE,HS=\08
STATION,HS=STATION\20
ALL,HS=ALL\20

NUMBERS SET

010 NUMBERS,CM
ONE,HS=1
THREE,HS=3
FIVE,HS=5
SEVEN,HS=7
NINER,HS=9
POINT,HS=.
SOUTH,HS=S\20
WEST,HS=W\20
ALPHA,HS=A
CHARLIE,HS=C
ECHO,HS=E
GOLF,HS=G
INDIA,HS=I
KILO,HS=K
MIKE,HS=M
OSCAR,HS=O
QUEBEC,HS=Q
SIERRA,HS=S
UNIFORM,HS=U
WHISKEY,HS=W
YANKEE,HS=Y
SPACE,HS=\20

TWO,HS=2
FOUR,HS=4
SIX,HS=6
EIGHT,HS=8
ZERO,HS=0
NORTH,HS=N\20
EAST,HS=E\20
TACK,HS=-
BRAVO,HS=B
DELTA,HS=D
FOXTROT,HS=F
HOTEL,HS=H
JULLIET,HS=J
LIMA,HS=L
NOVEMBER,HS=N
PAPA,HS=P
ROMEO,HS=R
TANGO,HS=T
VICTOR,HS=V
X-RAY,HS=X
ZULU,HS=Z
BACKSPACE,HS=\08

AVIATION SET

011 AVIATION,CM

ALTITUDE,HS=ALTITUDE\20
BEARING,HS=BEARING\20
POSITION,HS=POSITION\20
BINGO,HS=BINGO\20
COURSE,HS=COURSE\20
FIRE,HS=FIRE\20
LAUNCH,HS=LAUNCH\20
AEW,HS=AEW\20
ASW,HS=ASW\20
RECONN,HS=RECONN\20
RESCUE,HS=RESCUE\20
STRIKE_CAP,HS=STRCAP\20
SURCAP,HS=SURCAP\20
JAMMER,HS=JAMMER\20
NONE,HS=NONE\20
SPEED,HS=SPEED\20
PROCEED,HS=PROCEED\20
STOP,HS=STOP\20
FOR,HS=FOR\20
CH46,HS=CH46\20
E3A,HS=E3A\20
EP3E,HS=EP3E\20
FA18,HS=FA18\20
P3C,HS=P3C\20
LAMPS,HS=SH2F\20
SPACE,HS=\20

BARRIER,HS=BARRIER\20
FORCE,HS=FORCE\20
TRACK,HS=TRACK\20
TO,HS=TO\20
COVER,HS=COVER\20
AT,HS=AT\20
MISSION,HS=MISSION\20
AIRTANKER,HS=AIRTANKER\20
DECOY,HS=DECOY\20
RELAY,HS=RELAY\20
SEARCH,HS=SEARCH\20
SURVEILANCE,HS=SURVEILANCE\20
CAP,HS=CAP\20
STRIKE,HS=STRIKE\20
REFUEL,HS=REFUEL\20
TAKE,HS=TAKE\20
STATION,HS=STATION\20
A6E,HS=A6E\20
A7E,HS=A7E\20
E2C,HS=E2C\20
EA6B,HS=EA6B\20
F14A,HS=F14A\20
KA6D,HS=KA6D\20
S3A,HS=S3A\20
SH3H,HS=SH3H\20
BACKSPACE,HS=\08


SHIPS SET

012 SHIPS,NS=COMMANDS_TO_UNITS,CT=1,CM
KITTYHAWK,HS=FOR KITTY\20
FOX,HS=FOX\20
WILSON,HS=FOR WILSO\20
SPRUANCE,HS=FOR SPRUA\20
KNOX,HS=FOR KNOX\20
WONSAN,HS=FOR WONSA\20
LOS_ANGLES,HS=FOR LOSAN\20
MISSAWA,HS=FOR MISAW\20
ADAK,HS=FOR ADAK\20
JFK,HS=FOR JFK\20
R.K.TURNER,HS=FOR TURNR\20
MAC,HS=FOR MAC\20
FURER,HS=FOR FURER\20
IOWA,HS=FOR IOWA\20
LONGBEACH,HS=FOR LONGB\20
IOWA,HS=FOR IOWA\20
GAR,HS=FOR GAR\20
PETRO,HS=FOR PETRO\20
OMAHA,HS=FOR OMAHA\20
JOHN_ROGERS,HS=FOR ROGER\20
RATHBOURNE,HS=FOR RATHB\20
WICHITAU,HS=FOR WICHI\20
ALEKSIUV,HS=FOR ALEKS\20
VLADIVOSTOK,HS=FOR VLAD\20
MCCORMICK,HS=FOR MCCOR\20
JOHN_HANCOCK,HS=FOR HANCK\20

LOAD SET (WEAPON SET)

013 LOAD,HS=LOAD\20,CM
HARPOON,HS=HRPON\20
TLAM,HS=TLAM\20
ASROC,HS=ASROC\20
MARK46A,HS=MK46A\20
MARK57,HS=MK57\20
MARK83,HS=MK83\20
76MILLIMETER,HS=MM76\20
ROCKEYE,HS=RKEYE\20
SPARROW,HS=SPAR\20
WALLEYE,HS=WALLI\20

TASM,HS=TASM\20
APAM,HS=APAM\20
MARK46,HS=MK46\20
MARK48,HS=MK48\20
MARK82,HS=MK82\20
MARK84,HS=MK84\20
PHIONEX,HS=PHENX\20
SHRIKE,HS=SHRIK\20
SIDEWINDER,HS=SWDR\20
SM2ER,HS=SM2ER\20

ONE,HS=1
TWO,HS=2
THREE,HS=3
FOUR,HS=4
FIVE,HS=5
SIX,HS=6
SEVEN,HS=7
EIGHT,HS=8
NINER,HS=9
ZERO,HS=0
PINGER,HS=SSQ47\20
DIFAR,HS=SSQ53\20
DICASS,HS=SSQ62\20
SPACE,HS=\20
BACKSPACE,HS=\08
STANDARD_EXTENDED_RANGE,HS=STDER\20
STANDARD_MEDIUM_RANGE,HS=STDMR\20
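
Each entry in the listing above appears to pair a spoken word with a host string (HS) that is sent to the terminal in its place. The sketch below shows one way such an entry could be read. It assumes that a backslash followed by two hexadecimal digits stands for a single character code (so \20 is a space, \0D a carriage return, and \08 a backspace); that reading is inferred from the listing, not taken from VOTAN documentation. The function names are hypothetical, and trailing flags such as ,CM are not handled.

```python
import re


def decode_host_string(raw):
    """Replace each backslash plus two hex digits with the character it encodes."""
    return re.sub(r"\\([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), raw)


def parse_entry(line):
    """Split 'WORD,HS=STRING' into (spoken word, decoded host string)."""
    word, _, rest = line.partition(",HS=")
    return word.strip(), decode_host_string(rest.strip())
```

For example, `parse_entry(r"LAUNCH,HS=LAUNCH\20")` yields the word LAUNCH and the host string "LAUNCH " with a trailing space, so that consecutive spoken words arrive at the host already separated.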
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